ScaleML-RLHF/Qwen2.5-Math-1.5B-raft-plusplus-numina_math_em-sample1n16-sample16-iter2 • 2B • Updated Apr 7, 2025
ScaleML-RLHF/Qwen2.5-Math-1.5B-raft-plusplus-numina_math_em-sample1n16-sample16-iter1 • 2B • Updated Apr 7, 2025
ScaleML-RLHF/Qwen2.5-Math-7B-grpo-plusplus-numina_math_15_all-n4-step140 • 8B • Updated Apr 4, 2025 • 3
ScaleML-RLHF/Qwen2.5-Math-7B-grpo-plusplus-numina_math_15_all-n4-step120 • 8B • Updated Apr 4, 2025 • 7