RAD-2: Scaling Reinforcement Learning in a Generator-Discriminator Framework Paper • 2604.15308 • Published 3 days ago • 25
HY-World 2.0: A Multi-Modal World Model for Reconstructing, Generating, and Simulating 3D Worlds Paper • 2604.14268 • Published 4 days ago • 80
From P(y|x) to P(y): Investigating Reinforcement Learning in Pre-train Space Paper • 2604.14142 • Published 4 days ago • 24
Seedance 2.0: Advancing Video Generation for World Complexity Paper • 2604.14148 • Published 4 days ago • 136
KnowRL: Boosting LLM Reasoning via Reinforcement Learning with Minimal-Sufficient Knowledge Guidance Paper • 2604.12627 • Published 5 days ago • 96
SPPO: Sequence-Level PPO for Long-Horizon Reasoning Tasks Paper • 2604.08865 • Published 9 days ago • 29
From Word to World: Can Large Language Models be Implicit Text-based World Models? Paper • 2512.18832 • Published Dec 21, 2025 • 15