HeartMuLa: A Family of Open Sourced Music Foundation Models Paper • 2601.10547 • Published 10 days ago • 37
Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models Paper • 2601.07372 • Published 14 days ago • 36
The 1 Billion Token Challenge: Finding the Perfect Pre-training Mix Article • Published Nov 3, 2025 • 56
ChronoEdit: Towards Temporal Reasoning for Image Editing and World Simulation Paper • 2510.04290 • Published Oct 5, 2025 • 19
The End of Manual Decoding: Towards Truly End-to-End Language Models Paper • 2510.26697 • Published Oct 30, 2025 • 117
The Smol Training Playbook 📚: The secrets to building world-class LLMs Space • Featured • 2.92k
Video-Thinker: Sparking "Thinking with Videos" via Reinforcement Learning Paper • 2510.23473 • Published Oct 27, 2025 • 85