Tiny Series
Tiny datasets that empower the foundation of small language models!
nampdn-ai/tiny-strange-textbooks • Updated Feb 2, 2024 • 1M • 191 • 92
nampdn-ai/tiny-textbooks • Updated Jul 3, 2024 • 420k • 321 • 159
nampdn-ai/tiny-codes • Updated Sep 30, 2023 • 1.63M • 463 • 262
nampdn-ai/tiny-math-textbooks • Updated Jan 27, 2024 • 635k • 48 • 24

Mini Pretrain Datasets
nampdn-ai/mini-fineweb • Updated Mar 4 • 291M • 46 • 25
nampdn-ai/mini-peS2o • Updated Feb 6, 2024 • 1.91M • 14 • 10
nampdn-ai/mini-pubmed • Updated Sep 8, 2023 • 17k • 1 • 5
nampdn-ai/mini-proofpile • Updated Sep 5, 2023 • 221k • 2 • 7