Breaking the Capability Ceiling of LLM Post-Training by Reintroducing Markov States Paper • 2603.19987 • Published 3 days ago • 7
MathCoder-VL: Bridging Vision and Code for Enhanced Multimodal Mathematical Reasoning Paper • 2505.10557 • Published May 15, 2025 • 48
Absolute Zero: Reinforced Self-play Reasoning with Zero Data Paper • 2505.03335 • Published May 6, 2025 • 191