Towards Faithful and Controllable Personalization via Critique-Post-Edit Reinforcement Learning Paper • 2510.18849 • Published 4 days ago • 19
A^2FM: An Adaptive Agent Foundation Model for Tool-Aware Hybrid Reasoning Paper • 2510.12838 • Published 12 days ago • 22
QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs Paper • 2510.11696 • Published 12 days ago • 164
ACADREASON: Exploring the Limits of Reasoning Models with Academic Research Problems Paper • 2510.11652 • Published 12 days ago • 26
Flash-Searcher: Fast and Effective Web Agents via DAG-Based Parallel Execution Paper • 2509.25301 • Published 26 days ago • 17
Towards Personalized Deep Research: Benchmarks and Evaluations Paper • 2509.25106 • Published 26 days ago • 27
Tree Search for LLM Agent Reinforcement Learning Paper • 2509.21240 • Published about 1 month ago • 87
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models Paper • 2508.06471 • Published Aug 8 • 186
Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL Paper • 2508.13167 • Published Aug 6 • 127