SmolLM2-1.7B-Instruct/onnx/model_quantized.onnx

Commit History

Upload optimized ONNX model w/ GQA (#26)
31b70e2
verified

Xenova HF Staff committed on

Fix q8 weights (use uint8 for q8; int8 produces poor results) (#18)
b75eb65
verified

Xenova HF Staff committed on

Upload optimized ONNX weights (deduplicated) (#17)
b36fc77
verified

Xenova HF Staff committed on