A collection of multiple benchmarks for large reasoning model evaluation
datasets-and-models
non-profit
models (53)
guanning-ai/0122_p_normalization_32rollouts_1500 • 0.4B
guanning-ai/0122_pkpo_T16_1500 • 0.4B
guanning-ai/grpo_64rollouts_1200 • 0.4B
guanning-ai/grpo_64rollouts_900 • 0.4B
guanning-ai/grpo_64rollouts_600 • 0.4B
guanning-ai/grpo_64rollouts_300 • 0.4B
guanning-ai/grpo_64rollouts_1500 • 0.4B
guanning-ai/grpo_16rollouts_step1500
guanning-ai/p_normalization_16rollouts
guanning-ai/clipped_pnorm
datasets (138)
guanning-ai/gsm8k-platinum • 1.21k • 9
guanning-ai/math500_level5 • 134 • 22
guanning-ai/math500_level4 • 128 • 19
guanning-ai/math500_level3 • 105 • 19
guanning-ai/math500_level2 • 90 • 23
guanning-ai/math500_level1 • 43 • 20
guanning-ai/minervamath • 272 • 13
guanning-ai/smollm-gsm8k-data-1024 • 7.65M • 83
guanning-ai/gsm8k-metamath • 160k • 31
guanning-ai/gsm8k-mumath • 92k • 24