Article Introducing Trackio: A Lightweight Experiment Tracking Library from Hugging Face By abidlabs and 4 others • 11 days ago • 143
Article SmolLM3: smol, multilingual, long-context reasoner By loubnabnl and 22 others • Jul 8 • 614
Article The Transformers Library: standardizing model definitions By lysandre and 3 others • May 15 • 116
EuroBERT Collection Scaling Multilingual Encoders for European Languages • 4 items • Updated Mar 10 • 13
🧠 Reasoning datasets Collection Datasets with reasoning traces for math and code released by the community • 24 items • Updated May 19 • 162
Article FastRTC: The Real-Time Communication Library for Python By freddyaboulton and 1 other • Feb 25 • 172
Article From Chunks to Blocks: Accelerating Uploads and Downloads on the Hub By jsulz and 3 others • Feb 12 • 71
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published Feb 4 • 241
Article Open-source DeepResearch – Freeing our search agents By m-ric and 4 others • Feb 4 • 1.28k
Article Welcome to Inference Providers on the Hub 🔥 By julien-c and 6 others • Jan 28 • 486
Article Open-R1: a fully open reproduction of DeepSeek-R1 By eliebak and 2 others • Jan 28 • 876
Article Mastering Long Contexts in LLMs with KVPress By nvidia and 1 other • Jan 23 • 69
Article Fine-tune ModernBERT for RAG with Synthetic Data By sdiazlor and 2 others • Jan 20 • 41
Article Train 400x faster Static Embedding Models with Sentence Transformers By tomaarsen • Jan 15 • 201
OpenMathInstruct-2: Accelerating AI for Math with Massive Open-Source Instruction Data Paper • 2410.01560 • Published Oct 2, 2024 • 4
Trans-Tokenization and Cross-lingual Vocabulary Transfers: Language Adaptation of LLMs for Low-Resource NLP Paper • 2408.04303 • Published Aug 8, 2024 • 23