Commit History

Add custom sampler, train data loader and GRPO style train loop for ReTool_trainer
c710786
verified

bird-of-paradise committed on

adding test suite -- first commit
0690c9f
verified

bird-of-paradise committed on

replace `model.generate` with custom generation function to optimize kv_cache
a0dec77
verified

bird-of-paradise committed on

first commit -- curriculum callback
cfa2a65
verified

bird-of-paradise committed on

Add reward functions and registry
2fc6f4d
verified

bird-of-paradise committed on

Use weighted list reward functions
e9196fe
verified

bird-of-paradise committed on