Universal Deep Research: Bring Your Own Model and Strategy • Paper 2509.00244 • Published Aug 29 • 13
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey • Paper 2509.02547 • Published Sep 2 • 220
Efficient Multi-modal Large Language Models via Progressive Consistency Distillation • Paper 2510.00515 • Published Oct 1 • 39
DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search • Paper 2509.25454 • Published Sep 29 • 136
Demystifying Reinforcement Learning in Agentic Reasoning • Paper 2510.11701 • Published 22 days ago • 31