Attention as a Compass: Efficient Exploration for Process-Supervised RL in Reasoning Models Paper • 2509.26628 • Published 20 days ago • 12 • 3