Running MINT: Compute-Optimal Data-Free Mixed-Precision Quantization 🌿 Quantize LLMs to fit a target memory budget
Running SWAN: Data-Free Mixed-Precision Quantization 🦢 Quantize LLMs without data using per‑tensor mixed precision