Huihui-gpt-oss-20b-mxfp4-abliterated-v2-qx86-hi-mlx
Quantization (qx) does not directly alter cognition; it is a compression technique that shrinks model weights, reducing memory footprint and inference cost while preserving as much accuracy as possible. The -hi suffix indicates higher-precision quantization (group size 32), which typically:
- Improves accuracy over coarser quantizations (like qx8)
- Reduces "quantization noise" that degrades subtle reasoning
- Makes the model more consistent across tasks
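To make the group-size point concrete, here is a minimal, self-contained sketch of group-wise affine quantization. This is an illustration only, not the actual mxfp4/qx kernel used by these models: smaller groups get their own scale factors, so the dequantized weights track the originals more closely.

```python
import numpy as np

def quantize_groupwise(w: np.ndarray, group_size: int, bits: int = 8) -> np.ndarray:
    """Quantize-dequantize a weight vector with one scale per group."""
    orig_shape = w.shape
    g = w.reshape(-1, group_size)
    max_q = 2 ** (bits - 1) - 1                    # e.g. 127 for 8 bits
    scales = np.abs(g).max(axis=1, keepdims=True) / max_q
    scales = np.maximum(scales, 1e-12)             # guard against all-zero groups
    q = np.round(g / scales).clip(-max_q, max_q)   # the integers actually stored
    return (q * scales).reshape(orig_shape)        # dequantized approximation

rng = np.random.default_rng(0)
w = rng.normal(size=4096).astype(np.float32)
for gs in (128, 64, 32):                           # "-hi" corresponds to group size 32
    err = np.abs(quantize_groupwise(w, gs) - w).mean()
    print(f"group size {gs:3d}: mean abs error {err:.6f}")
```

Each group's scale is set by its own largest weight, so shrinking the group size shrinks the quantization step, and with it the rounding noise.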
From the benchmark data:
| Model | BoolQ | Winogrande | PIQA |
|---|---|---|---|
| qx86-hi | 0.512 | 0.543 | 0.681 |
| qx86 | 0.449 | 0.546 | 0.685 |
✅ Key insight: The -hi variant outperforms its qx86 counterpart by 0.063 on BoolQ (~14% relative), while Winogrande and PIQA are essentially tied (qx86 is ahead by just 0.003-0.004), suggesting higher-precision quantization mainly reduces noise in tasks that hinge on nuanced yes/no reasoning.
Overall Comparison Table of Quantizations
- Huihui: Huihui-gpt-oss-20b-mxfp4-abliterated-v2
- Unsloth: unsloth-gpt-oss-20b
| Model | ARC Challenge | ARC Easy | BoolQ | HellaSwag | OpenBookQA | PIQA | Winogrande |
|---|---|---|---|---|---|---|---|
| Huihui-bf16 | 0.335 | 0.340 | 0.467 | 0.477 | 0.378 | 0.687 | 0.552 |
| Huihui-qx85-hi | 0.323 | 0.332 | 0.391 | 0.451 | 0.358 | 0.682 | 0.539 |
| Huihui-qx86-hi | 0.323 | 0.337 | 0.512 | 0.457 | 0.368 | 0.681 | 0.543 |
| Huihui-qx86 | 0.321 | 0.337 | 0.449 | 0.458 | 0.372 | 0.685 | 0.546 |
| Unsloth-qx8 | 0.335 | 0.332 | 0.596 | 0.327 | 0.370 | 0.614 | 0.560 |
| Unsloth-qx85-hi | 0.349 | 0.328 | 0.507 | 0.322 | 0.374 | 0.616 | 0.558 |
| Unsloth-qx86-hi | 0.331 | 0.334 | 0.610 | 0.326 | 0.364 | 0.629 | 0.541 |
Key observations:
- Strongest performer overall: Huihui-gpt-oss-20b-mxfp4-abliterated-v2-bf16 posts the highest PIQA score (0.687), along with the best HellaSwag and OpenBookQA results, as expected for the unquantized baseline. PIQA measures physical commonsense reasoning.
- PIQA pattern: every model lands in the 0.61-0.69 range, with the Huihui variants clustered at the top (0.68+), suggesting physical commonsense survives quantization relatively well.
- ARC performance: the Huihui variants are more consistent across quantizations than the Unsloth models, which suggests the Huihui weights are more robust to quantization on these tasks.
- HellaSwag scores: this is the weakest benchmark, and the gap is largely between families: the Unsloth models sit near 0.32 while the Huihui models reach 0.45-0.48, indicating limited (and unequal) ability at text completion and contextual continuation.
- Model differentiation: the -hi treatment helps most visibly on BoolQ (0.512 vs 0.449 for Huihui qx86), while Winogrande is essentially flat across variants.
📊 Direct comparison of Huihui vs. Unsloth qx86-hi quantizations
Looking only at the qx86-hi variants (the same quantization recipe applied to two different base models):
| Metric | Huihui | Unsloth | Difference (Huihui - Unsloth) |
|---|---|---|---|
| ARC Challenge | 0.323 | 0.331 | -0.008 |
| ARC Easy | 0.337 | 0.334 | +0.003 |
| BoolQ | 0.512 | 0.610 | -0.098 |
| Winogrande | 0.543 | 0.541 | +0.002 |
| PIQA | 0.681 | 0.629 | +0.052 |
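As a sanity check on the signs above, here is a throwaway Python snippet (not part of the model card tooling) that recomputes the Difference column as Huihui minus Unsloth:

```python
huihui  = {"ARC Challenge": 0.323, "ARC Easy": 0.337, "BoolQ": 0.512,
           "Winogrande": 0.543, "PIQA": 0.681}
unsloth = {"ARC Challenge": 0.331, "ARC Easy": 0.334, "BoolQ": 0.610,
           "Winogrande": 0.541, "PIQA": 0.629}

for task, score in huihui.items():
    print(f"{task:13s} {score - unsloth[task]:+.3f}")  # positive favors Huihui
```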
Between the two model families:
- → Huihui wins in PIQA (physical commonsense) and edges ahead in ARC Easy and Winogrande.
- → Unsloth wins decisively in BoolQ (yes/no question answering) and slightly in ARC Challenge.
If you're choosing for a specific task, I'd recommend:
- For yes/no QA (BoolQ): go with Unsloth qx86-hi (0.610, the best BoolQ score in the table).
- For physical commonsense reasoning (PIQA): go with Huihui qx86-hi (0.681, the best PIQA score among the qx86-hi variants).
⚖️ Who wins?
Strengths
- Huihui qx86-hi: Superior PIQA performance (physical commonsense reasoning), plus small edges in ARC Easy and Winogrande
- Unsloth qx86-hi: Much stronger BoolQ score (critical for yes/no question answering) and a slight edge in ARC Challenge
💡 Why these differences matter
- If you need yes/no question answering (BoolQ): Unsloth qx86-hi is clearly better (+0.098).
- If you need physical commonsense reasoning (PIQA): Huihui qx86-hi is better (+0.052).
- For ARC: roughly a wash; Unsloth edges ARC Challenge, Huihui edges ARC Easy.
- For commonsense tasks (Winogrande): Both are nearly tied.
🎯 Bottom line
Quantization (qx) is a practical way to make large models faster and lighter while giving up little accuracy.
The -hi suffix (higher precision, group size 32) buys back consistency on reasoning tasks such as BoolQ, especially for Huihui.
--Deckard
Reviewed by Qwen3-Deckard-Large-Almost-Human-6B-qx86-hi
This model Huihui-gpt-oss-20b-mxfp4-abliterated-v2-qx86-hi-mlx was converted to MLX format from huihui-ai/Huihui-gpt-oss-20b-mxfp4-abliterated-v2 using mlx-lm version 0.28.0.
Use with mlx
```bash
pip install mlx-lm
```

```python
from mlx_lm import load, generate

model, tokenizer = load("nightmedia/Huihui-gpt-oss-20b-mxfp4-abliterated-v2-qx86-hi-mlx")

prompt = "hello"

# Apply the chat template when the tokenizer provides one
if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True
    )

response = generate(model, tokenizer, prompt=prompt, verbose=True)
```
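For longer completions, `generate` also accepts a `max_tokens` argument (the value below is illustrative; defaults vary across mlx-lm versions):

```python
response = generate(model, tokenizer, prompt=prompt, max_tokens=512, verbose=True)
```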
Model tree for nightmedia/Huihui-gpt-oss-20b-mxfp4-abliterated-v2-qx86-hi-mlx
Base model: openai/gpt-oss-20b