70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float Paper β’ 2504.11651 β’ Published Apr 15 β’ 28
Phi-3 Collection Phi-3 family of small language and multi-modal models. Language models are available in short- and long-context lengths. β’ 26 items β’ Updated May 1 β’ 569