Article Wan2.1 + DFloat11 Enables High-quality Text-to-video With 24GB VRAM By LeanQuant • May 24 • 1
DFloat11 | FLUX.1 Collection Losslessly compressed FLUX.1: requires < 20GB VRAM to run. • 6 items • Updated Jul 5 • 1
70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float Paper • 2504.11651 • Published Apr 15 • 29
Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models Paper • 2503.16419 • Published Mar 20 • 76
Llama Collection All our SOTA Llama models that crush competition :) • 6 items • Updated Nov 5, 2024 • 1