agurung/flawed-fictions-qwen3-4b-lengthpenalty • Reinforcement Learning • 4B • Updated about 8 hours ago • 9
agurung/flawed-fictions-qwen25-7b-lengthpenalty-litereason • Reinforcement Learning • 8B • Updated 3 days ago • 75
agurung/flawed-fictions-qwen25-7b-lengthpenalty • Reinforcement Learning • 8B • Updated 4 days ago • 175