sravanthib/Final-try-Llama3.1-8b-instruct-RL
Text Generation
Transformers
Safetensors
DigitalLearningGmbH/MATH-lighteval
llama
Generated from Trainer
open-r1
trl
grpo
conversational
text-generation-inference
arxiv:2402.03300
main
Commit History
End of training (a9d3760, verified), sravanthib committed on Mar 11
Model save (afc6261, verified), sravanthib committed on Mar 11
Training in progress, step 58 (cc26bd8, verified), sravanthib committed on Mar 11
Training in progress, step 50 (0e15cc8, verified), sravanthib committed on Mar 11
Training in progress, step 40 (494046c, verified), sravanthib committed on Mar 11
Training in progress, step 30 (3ec5ebf, verified), sravanthib committed on Mar 11
Training in progress, step 20 (d5f4478, verified), sravanthib committed on Mar 11
Training in progress, step 10 (d61a335, verified), sravanthib committed on Mar 11
initial commit (34c111f, verified), sravanthib committed on Mar 11