TAUR-dev/M-rl_1e_v2__pv_v2-rl__150 • 2B
TAUR-dev/M-rl_1e_v2__pv_v2_origonly2e-rl__150 • 2B
TAUR-dev/M-sft_exp_1e_zayneprompts_v2_orig_only2e-sft • 2B
TAUR-dev/M-sft_exp_1e_zayneprompts_v2-sft • 2B
TAUR-dev/M-0914_fastrl__1e_3args_dapo-rl • 2B
TAUR-dev/M-0914_fastrl__1e_3args_grpo-rl • 2B
TAUR-dev/M-rl_1e_v2__pv-rl • 2B
TAUR-dev/M-0914_fastrl__0epoch_3args_dapo-rl • 2B
TAUR-dev/M-1e_with_gpt4o_reflections-rl • 2B
TAUR-dev/M-1e_with_gpt4o_both-rl • 2B
TAUR-dev/M-0914_fastrl__0epoch_3args_grpo-rl • 2B
TAUR-dev/M-0914_fastrl__0epoch_3args_grpo_notokenmean-rl
TAUR-dev/M-0914_fastrl__1e_3args_dapo_nods-rl
TAUR-dev/M-0914_fastrl__0epoch_3args_dapo_nods-rl
TAUR-dev/M-sft_exp_1e_zayneprompts-sft • 2B
TAUR-dev/M-sft_exp_zayneV3_cd3arg_w_gpt4o_both-sft • 2B
TAUR-dev/M-sft_exp_zayneV3_1e_cd3arg_w_gpt4o_ref-sft • 2B
TAUR-dev/M-SFTV2_V3_rl_RUN__gpt4o_ref-rl
TAUR-dev/M-SFTV2_V3_rl_RUN__gpt4o_both-rl
TAUR-dev/M-SFTV2_V3_rl_er_RUN__9_11-rl
TAUR-dev/M-sft_exp_zayneV2-sft • 2B
TAUR-dev/M-0911__0epoch_3args_dapo_nods_50epoch-rl • 2B
TAUR-dev/M-0911__0epoch_alltask_dapo_nods_50epoch-rl
TAUR-dev/M-SFTV2_all_V2_RUN__9_11_w_verdict_reward-rl
TAUR-dev/M-SFTV2_all_V2_RUN__9_11-rl
TAUR-dev/M-SFTV2_V2_RUN__9_11-rl
TAUR-dev/M-0911__zayne_3args_grpo-rl • 2B
TAUR-dev/M-0911__qrepeat1_ref5_0C.-C.-C-IC.-CC_3args_grpo-rl • 2B
TAUR-dev/M-0911__0epoch_alltask_dapo_50epoch-rl • 2B
TAUR-dev/M-0911__qrepeat3_ref5_0C.-C.-C-IC.-CC_3args_grpo-rl • 2B