Learn the Ropes, Then Trust the Wins: Self-imitation with Progressive Exploration for Agentic Reinforcement Learning Paper • 2509.22601 • Published 25 days ago • 29
Less is More: Recursive Reasoning with Tiny Networks Paper • 2510.04871 • Published 15 days ago • 421
WebWeaver: Structuring Web-Scale Evidence with Dynamic Outlines for Open-Ended Deep Research Paper • 2509.13312 • Published Sep 16 • 104