Model size 12B #5
by amine-khelif - opened
Why does the model size say 12B instead of 20B?
B ≠ GB
Maybe HF doesn’t handle the new format yet
It is probably due to mixed precision.
The expert weights are 4-bit but packed as U8, so you have to double the param count from each of those tensors.
Yes, that's correct — we can check the parameter shapes in the model.safetensors.index.json file, and for the U8-packed 4-bit expert weights, we need to double the parameter count accordingly.
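A minimal sketch of that accounting: given a map of tensor name to (shape, dtype), double the element count of any U8 tensor, since each U8 byte packs two 4-bit values. The tensor names and shapes below are made up for illustration, not taken from the actual checkpoint.

```python
import math

# Hypothetical tensor inventory: name -> (shape, dtype).
# In a real checkpoint you'd read these from the safetensors headers;
# the names and shapes here are illustrative only.
tensors = {
    "model.layers.0.self_attn.q_proj.weight": ((2048, 2048), "BF16"),
    # 4-bit expert weights packed two-per-byte, so the last dim is halved.
    "model.layers.0.mlp.experts.gate_up_proj.blocks": ((32, 5760, 1440), "U8"),
}

def param_count(tensors):
    total = 0
    for name, (shape, dtype) in tensors.items():
        n = math.prod(shape)
        if dtype == "U8":
            # Each U8 element holds two 4-bit parameters.
            n *= 2
        total += n
    return total

print(param_count(tensors))
```

With real shapes, summing the naive element counts gives ~12B, while doubling the packed expert tensors recovers the ~20B figure.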