How Good Are Low-bit Quantized LLaMA3 Models? An Empirical Study • Paper • 2404.14047 • Published Apr 22, 2024
A Survey of Low-bit Large Language Models: Basics, Systems, and Algorithms • Paper • 2409.16694 • Published Sep 25, 2024
QVGen: Pushing the Limit of Quantized Video Generative Models • Paper • 2505.11497 • Published May 16, 2025
DB-LLM: Accurate Dual-Binarization for Efficient LLMs • Paper • 2402.11960 • Published Feb 19, 2024
LinVideo: A Post-Training Framework towards O(n) Attention in Efficient Video Generation • Paper • 2510.08318 • Published Oct 9, 2025
Light Forcing: Accelerating Autoregressive Video Diffusion via Sparse Attention • Paper • 2602.04789 • Published 8 days ago
LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit • Paper • 2405.06001 • Published May 9, 2024