SWAN: Data-Free Mixed-Precision Quantization
🦢
Quantize large language models without data using mixed‑precision
Model Quantization
We build open tools for efficient AI deployment. Our research focuses on quantization methods that preserve model quality while dramatically reducing hardware requirements, bringing 400B+ parameter models to a single machine.
baa.ai · SWAN Paper · GitHub
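To give a sense of the arithmetic behind that hardware claim, here is a rough sketch of weight-storage cost for a 400B-parameter model at different precisions. The function name and figures are illustrative back-of-envelope estimates (weights only, ignoring activations, KV cache, and runtime overhead), not numbers from the SWAN paper.

```python
def model_weight_gib(num_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GiB: params * bits / 8 bytes, in GiB."""
    return num_params * bits_per_weight / 8 / 2**30

params = 400e9  # a 400B-parameter model
for bits in (16, 8, 4):
    print(f"{bits:>2}-bit: {model_weight_gib(params, bits):,.0f} GiB")
# 16-bit: 745 GiB
#  8-bit: 373 GiB
#  4-bit: 186 GiB
```

At 16-bit precision the weights alone exceed the memory of any single accelerator, while a 4-bit (or mixed 4/8-bit) representation brings them within reach of a multi-GPU workstation.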
Quantize large language models without data using mixed‑precision
Generate quantization‑ready student models via guided distillation
Train LLMs to be quantization‑ready with sensitivity‑aware methods