MINT: Compute-Optimal Data-Free Mixed-Precision Quantization
Optimize LLM size with data‑free mixed‑precision quantization
Model Quantization
Quantize LLMs without data using per‑tensor mixed precision
Generate quantization‑ready student models via guided distillation
Train LLMs to be quantization‑ready with sensitivity‑aware methods
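The capability lines above center on data-free, per-tensor mixed-precision quantization: choosing a bit-width for each weight tensor using only the weights themselves, with no calibration data. A minimal sketch of that idea is below; the function names (`quantize_symmetric`, `assign_bits`), the candidate bit-widths, and the MSE threshold are illustrative assumptions, not MINT's actual algorithm.

```python
import numpy as np

def quantize_symmetric(w, bits):
    """Symmetric uniform quantization of a weight tensor to `bits` bits."""
    qmax = 2 ** (bits - 1) - 1
    amax = np.max(np.abs(w))
    scale = amax / qmax if amax > 0 else 1.0
    return np.round(w / scale).clip(-qmax, qmax) * scale

def assign_bits(tensors, candidate_bits=(2, 3, 4, 8), max_mse=1e-4):
    """Data-free per-tensor precision assignment (illustrative):
    pick the fewest bits whose reconstruction MSE, computed from the
    weights alone, stays under max_mse."""
    choice = {}
    for name, w in tensors.items():
        for b in candidate_bits:
            if np.mean((w - quantize_symmetric(w, b)) ** 2) <= max_mse:
                choice[name] = b
                break
        else:
            # no candidate met the threshold; keep the highest precision
            choice[name] = max(candidate_bits)
    return choice
```

Because the sensitivity proxy (reconstruction MSE) is computed directly from the weights, the whole assignment runs without any input data, which is the defining property of a data-free scheme.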