Article The 1 Billion Token Challenge: Finding the Perfect Pre-training Mix • 23 days ago • 45
Pre-training Dataset Samples Collection A collection of pre-training dataset samples in sizes of 10M, 100M, and 1B tokens, suited to quick experiments and ablations. • 19 items • Updated 15 days ago • 14
The Smol Training Playbook 📚 Featured The secrets to building world-class LLMs • 2.43k
GPT-OSS General (4.2B to 20B) Collection Collection of pruned GPT-OSS models spanning 1-32 experts, maintaining general capabilities across domains while reducing computational requirements. • 29 items • Updated Aug 13 • 8