Beyond Token-level Supervision: Unlocking the Potential of Decoding-based Regression via Reinforcement Learning • Paper 2512.06533 • Published Dec 6, 2025
P1: Mastering Physics Olympiads with Reinforcement Learning • Paper 2511.13612 • Published Nov 17, 2025