Safety Tax: Safety Alignment Makes Your Large Reasoning Models Less Reasonable Paper • 2503.00555 • Published Mar 1, 2025 • 1
Mitigating Safety Tax via Distribution-Grounded Refinement in Large Reasoning Models Paper • 2602.02136 • Published 10 days ago • 7
DeepResearchEval: An Automated Framework for Deep Research Task Construction and Agentic Evaluation Paper • 2601.09688 • Published 29 days ago • 126
AgentFly: Fine-tuning LLM Agents without Fine-tuning LLMs Paper • 2508.16153 • Published Aug 22, 2025 • 160
UltraHorizon: Benchmarking Agent Capabilities in Ultra Long-Horizon Scenarios Paper • 2509.21766 • Published Sep 26, 2025 • 24