E-GRPO: High Entropy Steps Drive Effective Reinforcement Learning for Flow Models Paper • 2601.00423 • Published 20 days ago • 8
Coupling Experts and Routers in Mixture-of-Experts via an Auxiliary Loss Paper • 2512.23447 • Published 23 days ago • 94
FineWeb: decanting the web for the finest text data at scale 🍷 Running • Featured • 1.27k • Generate high-quality text data for LLMs using FineWeb