ExGRPO
Updated 19 days ago
A collection of models trained using ExGRPO.
rzzhan/ExGRPO-Qwen2.5-Math-7B-Zero • 8B • Updated 19 days ago • 13
rzzhan/ExGRPO-LUFFY-7B-Continual • 8B • Updated 19 days ago • 14
rzzhan/ExGRPO-Qwen2.5-7B-Instruct • 8B • Updated 19 days ago • 15
rzzhan/ExGRPO-Qwen2.5-Math-1.5B-Zero • 2B • Updated 19 days ago • 16
rzzhan/ExGRPO-Llama3.1-8B-Zero • 8B • Updated 19 days ago • 11
rzzhan/ExGRPO-Llama3.1-8B-Instruct • 8B • Updated 19 days ago • 7
ExGRPO: Learning to Reason from Experience • Paper • 2510.02245 • Published 19 days ago • 76