Quantization was performed using exllama3 v0.0.22.

| Quant | Size (GB) | KL-div (quant, orig) | KL-div (orig, quant) | Perplexity | Top-K K=1 | Top-K K=2 | Top-K K=3 | Top-K K=4 | Top-K K=5 |
|---|---|---|---|---|---|---|---|---|---|
| 2.0bpw | 55 | 0.36735150 | 0.42469226 | 9.46492433 | 0.7699 | 0.4340 | 0.2006 | 0.0796 | 0.0289 |
| 3.0bpw | 82 | 0.14842009 | 0.15566614 | 8.74921130 | 0.8640 | 0.6125 | 0.3773 | 0.2072 | 0.1040 |
| 4.0bpw | 108 | 0.07256054 | 0.07650418 | 8.43832064 | 0.9118 | 0.7281 | 0.5222 | 0.3439 | 0.2105 |
| 5.0bpw | 135 | 0.04801990 | 0.04921814 | 8.35222293 | 0.9344 | 0.7901 | 0.6154 | 0.4472 | 0.3056 |
| 6.0bpw | 161 | 0.04015230 | 0.04071388 | 8.35782554 | 0.9449 | 0.8209 | 0.6651 | 0.5071 | 0.3670 |
| 7.0bpw | 188 | 0.03484128 | 0.03757493 | 8.35427106 | 0.9510 | 0.8380 | 0.6922 | 0.5420 | 0.4046 |
| 8.0bpw | 214 | 0.03227931 | 0.03371121 | 8.33833098 | 0.9533 | 0.8440 | 0.7042 | 0.5587 | 0.4226 |
| original | 214 | - | - | 8.34981264 | - | - | - | - | - |
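
For readers unfamiliar with the metrics in the table above, the following is a minimal sketch of how such numbers can be computed from the logits of two models. It is not the exllama3 evaluation code. The function names are hypothetical, and two things are assumptions rather than facts taken from this card: that `KL-div (a, b)` means KL(a || b), and that "Top-K K=n" counts positions where both models rank the same n tokens in the same order.

```python
import math

def softmax(logits):
    """Convert a list of logits into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_div(p, q):
    """KL(p || q) in nats. Assumes q has no zero entries where p > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def top_k_ids(logits, k):
    """Token IDs of the k largest logits, highest first."""
    return sorted(range(len(logits)), key=lambda i: -logits[i])[:k]

def top_k_agreement(logits_a, logits_b, k):
    """Fraction of positions whose top-k token IDs match in the same order.

    Assumption: ordered match; the card does not define the metric.
    """
    hits = sum(top_k_ids(a, k) == top_k_ids(b, k)
               for a, b in zip(logits_a, logits_b))
    return hits / len(logits_a)

def perplexity(logits_seq, targets):
    """exp of the mean negative log-likelihood of the target tokens."""
    nll = -sum(math.log(softmax(l)[t]) for l, t in zip(logits_seq, targets))
    return math.exp(nll / len(targets))

# Toy per-position logits over a 3-token vocabulary (illustrative values only).
orig_logits = [[2.0, 1.0, 0.1], [0.5, 2.5, 1.0]]
quant_logits = [[1.8, 1.1, 0.3], [0.4, 1.0, 2.4]]

# KL divergence is asymmetric, which is why the table reports both directions.
p = softmax(orig_logits[0])
q = softmax(quant_logits[0])
kl_pq = kl_div(p, q)
kl_qp = kl_div(q, p)
```

Under the ordered-match assumption, agreement can only stay equal or fall as K grows, which is consistent with the falling Top-K columns in the table.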