Generative Universal Verifier as Multimodal Meta-Reasoner • Paper • 2510.13804 • Published 5 days ago • 24
Beyond Pass@1: Self-Play with Variational Problem Synthesis Sustains RLVR • Paper • 2508.14029 • Published Aug 19 • 118