FineWeb: decanting the web for the finest text data at scale 🍷 Space • Generate high-quality text data for LLMs using FineWeb • 1.25k
GateBreaker: Gate-Guided Attacks on Mixture-of-Expert LLMs Paper • 2512.21008 • Published 14 days ago • 3
Valori: A Deterministic Memory Substrate for AI Systems Paper • 2512.22280 • Published 13 days ago • 3
TinyStories: How Small Can Language Models Be and Still Speak Coherent English? Paper • 2305.07759 • Published May 12, 2023 • 38
The Eiffel Tower Llama 📝 Space • Explore the Eiffel Tower Llama experiment with open-source models • 97
Olmo 3.1 Collection The latest members of the Olmo 3 family: another 3 weeks of RL for 32B Think, the 32B Instruct model, large post-training research datasets... • 9 items • Updated 15 days ago • 42
The Common Pile v0.1: An 8TB Dataset of Public Domain and Openly Licensed Text Paper • 2506.05209 • Published Jun 5, 2025 • 59