ajagota71/pythia-70m-fb-detox-checkpoint-epoch-120 Reinforcement Learning • 0.1B • Updated May 16 • 18
ajagota71/pythia-70m-fb-detox-checkpoint-epoch-140 Reinforcement Learning • 0.1B • Updated May 16 • 4
ajagota71/pythia-70m-fb-detox-checkpoint-epoch-160 Reinforcement Learning • 0.1B • Updated May 16 • 3
ajagota71/pythia-70m-fb-detox-checkpoint-epoch-180 Reinforcement Learning • 0.1B • Updated May 16 • 18
ajagota71/pythia-70m-fb-detox-checkpoint-epoch-200 Reinforcement Learning • 0.1B • Updated May 16 • 3
ajagota71/pythia-160m-fb-detox-checkpoint-epoch-20 Reinforcement Learning • 0.2B • Updated May 16 • 3
ajagota71/pythia-160m-fb-detox-checkpoint-epoch-60 Reinforcement Learning • 0.2B • Updated May 16 • 3
ajagota71/pythia-160m-fb-detox-checkpoint-epoch-100 Reinforcement Learning • 0.2B • Updated May 16 • 3
ajagota71/pythia-410m-fb-detox-checkpoint-epoch-20 Reinforcement Learning • 0.4B • Updated May 16 • 3
ajagota71/pythia-410m-fb-detox-checkpoint-epoch-40 Reinforcement Learning • 0.4B • Updated May 16 • 13
ajagota71/pythia-410m-fb-detox-checkpoint-epoch-60 Reinforcement Learning • 0.4B • Updated May 16 • 4
ajagota71/pythia-410m-fb-detox-checkpoint-epoch-80 Reinforcement Learning • 0.4B • Updated May 16 • 13
ajagota71/pythia-410m-fb-detox-checkpoint-epoch-100 Reinforcement Learning • 0.4B • Updated May 16 • 13
ajagota71/pythia-410m-fb-detox-checkpoint-epoch-120 Reinforcement Learning • 0.4B • Updated May 16 • 13
ajagota71/pythia-410m-fb-detox-checkpoint-epoch-140 Reinforcement Learning • 0.4B • Updated May 16 • 13
ajagota71/pythia-410m-fb-detox-checkpoint-epoch-160 Reinforcement Learning • 0.4B • Updated May 16 • 13
ajagota71/pythia-410m-fb-detox-checkpoint-epoch-180 Reinforcement Learning • 0.4B • Updated May 16 • 13
ajagota71/pythia-410m-fb-detox-checkpoint-epoch-200 Reinforcement Learning • 0.4B • Updated May 16 • 25
mradermacher/VeriReason-Qwen2.5-7b-SFT-Reasoning-GGUF Reinforcement Learning • 8B • Updated 18 days ago • 2.35k • 1
mradermacher/VeriReason-Qwen2.5-1.5B-grpo-small-GGUF Reinforcement Learning • 2B • Updated 18 days ago • 2.24k • 1
mradermacher/VeriReason-Qwen2.5-3B-Verilog-RTL-GRPO-reasoning-tb-GGUF Reinforcement Learning • 3B • Updated 18 days ago • 2.24k
mradermacher/VeriReason-Qwen2.5-7b-SFT-Reasoning-i1-GGUF Reinforcement Learning • 8B • Updated 18 days ago • 4.28k • 1
mradermacher/VeriReason-Qwen2.5-1.5b-RTLCoder-Verilog-GRPO-reasoning-tb-GGUF Reinforcement Learning • 2B • Updated 18 days ago • 3.36k
mradermacher/VeriReason-Qwen2.5-3b-RTLCoder-Verilog-GRPO-reasoning-tb-GGUF Reinforcement Learning • 3B • Updated 18 days ago • 2.25k
mradermacher/VeriReason-Qwen2.5-7b-RTLCoder-Verilog-GRPO-reasoning-tb-GGUF Reinforcement Learning • 8B • Updated 18 days ago • 2.25k • 1
mradermacher/VeriReason-Qwen2.5-7b-RTLCoder-Verilog-GRPO-reasoning-tb-i1-GGUF Reinforcement Learning • 8B • Updated 18 days ago • 4.47k • 3
mradermacher/CscSQL-Grpo-XiYanSQL-QwenCoder-7B-2502-GGUF Reinforcement Learning • 8B • Updated 18 days ago • 3.23k