Idea: Train A 1.5B Model On Math Data

#4
by Tralalabs - opened

Pretraining Datasets: HuggingFaceTB/finemath, HuggingFaceFW/fineweb-edu (keeps knowledge alive), GAIR/MathPile, open-web-math/OpenWebMath, nvidia/Nemotron-CC-Math-v1, HuggingFaceTB/smollm-corpus
Finetuning Datasets: allenai/llama-3.1-tulu-3-8b-preference-mixture, allenai/tulu-3-sft-mixture (filter to English rows), AI-MO/NuminaMath-TIR, meta-math/MetaMathQA, openai/gsm8k, nvidia/OpenMathInstruct-2
Context Window: 32K (32,768)

Sign up or log in to comment