Leveraging Verifier-Based Reinforcement Learning in Image Editing • Paper • 2604.27505 • Published 5 days ago • 29
Offline Evaluation Measures of Fairness in Recommender Systems • Paper • 2604.25032 • Published 8 days ago • 1
RationalRewards: Reasoning Rewards Scale Visual Generation Both Training and Test Time • Paper • 2604.11626 • Published 22 days ago • 101
Adam's Law: Textual Frequency Law on Large Language Models • Paper • 2604.02176 • Published Apr 2 • 500
GrandCode: Achieving Grandmaster Level in Competitive Programming via Agentic Reinforcement Learning • Paper • 2604.02721 • Published Apr 3 • 626
Sangsang/grpo_Qwen3-4B_bs16_g16_mb128_lr1e-6_b1e-3_clip0p2_temp0p7_ep30 • Text Generation • Updated 30 days ago • 8
CARLA-Air: Fly Drones Inside a CARLA World -- A Unified Infrastructure for Air-Ground Embodied Intelligence • Paper • 2603.28032 • Published Mar 30 • 341
PRBench: End-to-end Paper Reproduction in Physics Research • Paper • 2603.27646 • Published Mar 29 • 29
FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization • Paper • 2603.19835 • Published Mar 20 • 350
Lost in Stories: Consistency Bugs in Long Story Generation by LLMs • Paper • 2603.05890 • Published Mar 6 • 93