Models pretrained from scratch for "Echo Chamber: RL Post-training Amplifies Behaviors Learned in Pretraining".
Rosie Zhao
rosieyzh
AI & ML interests
theory of machine learning, deep learning
OLMo-1B-as_fm3_tg_omi1_omi2
OLMo 1B model pretrained on Algebraic Stack, FineMath3, TinyGSM, OpenMathInstruct1 (OMI1), and OpenMathInstruct2 (OMI2). Includes checkpoints from PPO training on the GSM8K train split.
- rosieyzh/OLMo-1B-as_fm3_tg_omi1_omi2_ppo (Text Generation • 1B)
- rosieyzh/OLMo-1B-as_fm3_tg_omi1_omi2_episode1 (Text Generation • 1B)
- rosieyzh/OLMo-1B-as_fm3_tg_omi1_omi2_episode2 (Text Generation • 1B)
- rosieyzh/OLMo-1B-as_fm3_tg_omi1_omi2_episode3 (Text Generation • 1B)
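The checkpoint names above follow one pattern: the base model name plus a `_ppo` or `_episodeN` suffix. As a minimal sketch, the snippet below rebuilds those repository ids and shows one way a checkpoint might be loaded. The helper names `checkpoint_repo_ids` and `load_checkpoint` are hypothetical. It is also an assumption that these repositories load through the standard `transformers` auto classes; the model cards may call for a different loader or extra packages.

```python
from typing import List, Sequence

# Names copied from the listing above.
OWNER = "rosieyzh"
BASE = "OLMo-1B-as_fm3_tg_omi1_omi2"
SUFFIXES = ("ppo", "episode1", "episode2", "episode3")


def checkpoint_repo_ids(
    owner: str = OWNER,
    base: str = BASE,
    suffixes: Sequence[str] = SUFFIXES,
) -> List[str]:
    """Build the repository ids for the listed checkpoints (hypothetical helper)."""
    return [f"{owner}/{base}_{suffix}" for suffix in suffixes]


def load_checkpoint(repo_id: str):
    """Load a tokenizer and model for one checkpoint (hypothetical helper).

    Assumption: the repository is compatible with the transformers auto
    classes. The import is lazy so that listing ids needs no extra packages.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id)
    return tokenizer, model
```

Calling `load_checkpoint(checkpoint_repo_ids()[0])` would attempt to download the `_ppo` checkpoint, which requires network access and the `transformers` package.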
OLMo-150M and OLMo-1B Pretrained Models
Models pretrained from scratch for "Echo Chamber: RL Post-training Amplifies Behaviors Learned in Pretraining".
OLMo-1B-as_fm3_tg_omi2
OLMo 1B model pretrained on Algebraic Stack, FineMath3, TinyGSM, and OpenMathInstruct2. Includes checkpoints from PPO training on the GSM8K train split.