It seems all mlx-community Qwen3 models have an error in the Jinja template. Also, the MLX 3-bit models for 30B and 32B hallucinate a lot. I tested the Q3 GGUF and it works better, but at half the speed of the MLX version.