Text Generation
MLX
Safetensors
qwen3_moe
programming
code generation
code
codeqwen
Mixture of Experts
coding
coder
qwen2
chat
qwen
qwen-coder
Qwen3-Coder-30B-A3B-Instruct
Qwen3-30B-A3B
mixture of experts
128 experts
8 active experts
1 million context
qwen3
finetune
brainstorm 20x
brainstorm
optional thinking
unsloth
conversational
8-bit precision
Update README.md
README.md (changed)
# Qwen3-Yoyo-V3-42B-A3B-Thinking-TOTAL-RECALL-ST-TNG-qx86x-hi-mlx

This is a new-old-stock version of the model, with embeddings quantized at 8 bits.

The original [Qwen3-Yoyo-V3-42B-A3B-Thinking-TOTAL-RECALL-ST-TNG-qx86-hi-mlx](https://huggingface.co/nightmedia/Qwen3-Yoyo-V3-42B-A3B-Thinking-TOTAL-RECALL-ST-TNG-qx86-hi-mlx) uses 6-bit embeddings.
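For context, mlx-lm lets a conversion assign different bit widths per layer through a quantization predicate. The sketch below only illustrates that mechanism, assuming the `quant_predicate` hook of `mlx_lm.convert` in recent mlx-lm releases; the function name, bit widths, and group sizes here are placeholders, and the actual qx86x-hi recipe is the author's own and is not reproduced here.

```python
# Illustrative sketch only: keep embeddings at 8 bits while quantizing the
# remaining layers more aggressively. The real qx86x-hi recipe may use
# different bit widths, group sizes, and layer selections.
from mlx_lm import convert

def eight_bit_embeddings(path, module, config):
    # Called per quantizable layer; returning a dict overrides the defaults.
    if "embed_tokens" in path:
        return {"bits": 8, "group_size": 32}  # hypothetical embedding setting
    return {"bits": 6, "group_size": 64}      # hypothetical setting for other layers

convert(
    hf_path="DavidAU/Qwen3-Yoyo-V3-42B-A3B-Thinking-TOTAL-RECALL-ST-TNG",
    mlx_path="Qwen3-Yoyo-V3-42B-A3B-Thinking-TOTAL-RECALL-ST-TNG-qx86x-hi-mlx",
    quantize=True,
    quant_predicate=eight_bit_embeddings,
)
```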
```bash
Perplexity:  4.431 ± 0.031
Peak memory: 43.43 GB
```
Metrics coming soon. If this proves better than the qx86-hi, it will replace it in the catalog.

-G
This model [Qwen3-Yoyo-V3-42B-A3B-Thinking-TOTAL-RECALL-ST-TNG-qx86x-hi-mlx](https://huggingface.co/nightmedia/Qwen3-Yoyo-V3-42B-A3B-Thinking-TOTAL-RECALL-ST-TNG-qx86x-hi-mlx) was
converted to MLX format from [DavidAU/Qwen3-Yoyo-V3-42B-A3B-Thinking-TOTAL-RECALL-ST-TNG](https://huggingface.co/DavidAU/Qwen3-Yoyo-V3-42B-A3B-Thinking-TOTAL-RECALL-ST-TNG)
using mlx-lm version **0.28.3**.
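Since the conversion targets mlx-lm, the model can be loaded with the usual mlx-lm Python API. A minimal usage sketch (the prompt text is a placeholder):

```python
from mlx_lm import load, generate

# Load the quantized weights and tokenizer from the Hub (or a local path).
model, tokenizer = load("nightmedia/Qwen3-Yoyo-V3-42B-A3B-Thinking-TOTAL-RECALL-ST-TNG-qx86x-hi-mlx")

prompt = "Write a short haiku about quantization."  # placeholder prompt

# Apply the chat template when the tokenizer ships one.
if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

response = generate(model, tokenizer, prompt=prompt, verbose=True)
```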