DR Tulu: Reinforcement Learning with Evolving Rubrics for Deep Research Paper • 2511.19399 • Published 7 days ago • 51
LiveTradeBench: Seeking Real-World Alpha with Large Language Models Paper • 2511.03628 • Published 26 days ago • 11
Video-Thinker: Sparking "Thinking with Videos" via Reinforcement Learning Paper • 2510.23473 • Published Oct 27 • 83
ToolScope: An Agentic Framework for Vision-Guided and Long-Horizon Tool Use Paper • 2510.27363 • Published Oct 31 • 22
The Tool Decathlon: Benchmarking Language Agents for Diverse, Realistic, and Long-Horizon Task Execution Paper • 2510.25726 • Published Oct 29 • 45
ReForm: Reflective Autoformalization with Prospective Bounded Sequence Optimization Paper • 2510.24592 • Published Oct 28 • 51
ImagerySearch: Adaptive Test-Time Search for Video Generation Beyond Semantic Dependency Constraints Paper • 2510.14847 • Published Oct 16 • 55
DeepAnalyze: Agentic Large Language Models for Autonomous Data Science Paper • 2510.16872 • Published Oct 19 • 102
DeepAgent: A General Reasoning Agent with Scalable Toolsets Paper • 2510.21618 • Published Oct 24 • 98
BAPO: Stabilizing Off-Policy Reinforcement Learning for LLMs via Balanced Policy Optimization with Adaptive Clipping Paper • 2510.18927 • Published Oct 21 • 82
WithAnyone: Towards Controllable and ID Consistent Image Generation Paper • 2510.14975 • Published Oct 16 • 83