Recent Activity
ScaleML-RLHF/Qwen2.5-Math-1.5B-grpo-n8-step120 • 2B
ScaleML-RLHF/Qwen2.5-Math-1.5B-grpo-n8-step110 • 2B
ScaleML-RLHF/Qwen2.5-Math-1.5B-grpo-n8-step100 • 2B
ScaleML-RLHF/Qwen2.5-Math-1.5B-grpo-n8-step90 • 2B
ScaleML-RLHF/Qwen2.5-Math-1.5B-grpo-n8-step80 • 2B
ScaleML-RLHF/Qwen2.5-Math-1.5B-grpo-n8-step70 • 2B
ScaleML-RLHF/Qwen2.5-Math-1.5B-grpo-n8-step10 • 2B
ScaleML-RLHF/Qwen2.5-Math-1.5B-grpo-em-n8-8-iter8 • 2B
ScaleML-RLHF/Qwen2.5-Math-1.5B-grpo-em-n8-8-iter7 • 2B
ScaleML-RLHF/Qwen2.5-Math-1.5B-grpo-em-n8-8-iter6 • 2B
ScaleML-RLHF/Qwen2.5-Math-1.5B-grpo-em-n8-8-iter5 • 2B
ScaleML-RLHF/Qwen2.5-Math-1.5B-grpo-em-n8-8-iter4 • 2B
ScaleML-RLHF/Qwen2.5-Math-1.5B-grpo-em-n8-8-iter3 • 2B
ScaleML-RLHF/Qwen2.5-Math-1.5B-grpo-em-n8-8-iter2 • 2B
ScaleML-RLHF/Qwen2.5-Math-1.5B-grpo-em-n8-8-iter1 • 2B
ScaleML-RLHF/Qwen2.5-Math-1.5B-grpo-em-n8-8-noShuffle-chunk4-iter3 • 2B
ScaleML-RLHF/Qwen2.5-Math-1.5B-grpo-em-n8-8-noShuffle-chunk4-iter2 • 2B
ScaleML-RLHF/Qwen2.5-Math-1.5B-grpo-em-n8-8-noShuffle-chunk4-iter1 • 2B
ScaleML-RLHF/Qwen2.5-Math-1.5B-grpo-em-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-chunk4-iter6 • 2B
ScaleML-RLHF/Qwen2.5-Math-1.5B-grpo-em-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-chunk4-iter5 • 2B
ScaleML-RLHF/Qwen2.5-Math-1.5B-grpo-em-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-chunk4-iter4 • 2B
ScaleML-RLHF/Qwen2.5-Math-1.5B-grpo-em-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-chunk4-iter3 • 2B
ScaleML-RLHF/Qwen2.5-Math-1.5B-grpo-em-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-chunk4-iter2 • 2B
ScaleML-RLHF/Qwen2.5-Math-1.5B-grpo-em-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-chunk4-iter1 • 2B
ScaleML-RLHF/Qwen2.5-Math-1.5B-grpo-plusplus-numina_math_15_all-n32-step40
ScaleML-RLHF/Qwen2.5-Math-1.5B-grpo-plusplus-numina_math_15_all-n32-step30
ScaleML-RLHF/Qwen2.5-Math-1.5B-grpo-plusplus-numina_math_15_all-n32-step20
ScaleML-RLHF/Qwen2.5-Math-1.5B-grpo-plusplus-numina_math_15_all-n32-step10
ScaleML-RLHF/Qwen2.5-Math-1.5B-raft-plusplus-numina_math_em-sample1n16-sample16-iter4
ScaleML-RLHF/Qwen2.5-Math-1.5B-raft-plusplus-numina_math_em-sample1n16-sample16-iter3