The Ultra-Scale Playbook 🌌 The ultimate guide to training LLMs on large GPU clusters • Running • 3.7k
The Smol Training Playbook 📚 The secrets to building world-class LLMs • Running on CPU Upgrade • Featured • 3k
deepseek-ai/DeepSeek-V2-Chat-0628 Text Generation • 236B • Updated Jul 18, 2024 • 3.86k • 177
Phi-3 Collection Phi-3 family of small language and multi-modal models. Language models are available in short- and long-context lengths. • 26 items • Updated May 1, 2025 • 574
The Flan Collection: Designing Data and Methods for Effective Instruction Tuning Paper • 2301.13688 • Published Jan 31, 2023 • 9
Flan-T5 release Collection The Flan-T5 covers 4 checkpoints of different sizes each time. It also includes upgraded versions trained using Universal sampling. • 7 items • Updated Jul 10, 2025 • 33