Marco DeepResearch: Unlocking Efficient Deep Research Agents via Verification-Centric Design
Abstract
A verification-centric framework for deep research agents improves performance on complex benchmarks by incorporating error checking at multiple stages of development and inference.
Deep research agents autonomously conduct open-ended investigations, integrating complex information retrieval with multi-step reasoning across diverse sources to solve real-world problems. To sustain this capability on long-horizon tasks, reliable verification is critical during both training and inference. A major bottleneck in existing paradigms stems from the lack of explicit verification mechanisms in QA data synthesis, trajectory construction, and test-time scaling. Errors introduced at each stage propagate downstream and degrade the overall agent performance. To address this, we present Marco DeepResearch, a deep research agent optimized with a verification-centric framework design at three levels: (1) QA Data Synthesis: We introduce verification mechanisms to graph-based and agent-based QA synthesis to control question difficulty while ensuring answers are unique and correct; (2) Trajectory Construction: We design a verification-driven trajectory synthesis method that injects explicit verification patterns into training trajectories; and (3) Test-Time Scaling: We use Marco DeepResearch itself as a verifier at inference time and effectively improve performance on challenging questions. Extensive experimental results demonstrate that our proposed Marco DeepResearch agent significantly outperforms 8B-scale deep research agents on most challenging benchmarks, such as BrowseComp and BrowseComp-ZH. Crucially, under a maximum budget of 600 tool calls, Marco DeepResearch even surpasses or approaches several 30B-scale agents, like Tongyi DeepResearch-30B.
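The abstract does not spell out how the agent acts as its own verifier at inference time, so the following is only a minimal sketch of one plausible reading: rerun the agent until a verifier accepts the answer or a tool-call budget is spent. The names `Attempt`, `answer_with_verification`, `run_agent`, and `verify` are hypothetical and are not taken from the paper; only the 600-call budget comes from the abstract.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Attempt:
    answer: str
    tool_calls: int  # tool calls consumed by this single attempt


def answer_with_verification(
    question: str,
    run_agent: Callable[[str], Attempt],   # hypothetical: one full agent rollout
    verify: Callable[[str, str], bool],    # hypothetical: verifier over (question, answer)
    max_tool_calls: int = 600,             # budget figure quoted in the abstract
) -> Optional[str]:
    """Retry until the verifier accepts an answer or the budget is spent."""
    used = 0
    last_answer: Optional[str] = None
    while used < max_tool_calls:
        attempt = run_agent(question)
        # Count at least one call per attempt so the loop always terminates.
        # Note: the final attempt may overshoot the budget in this simple sketch.
        used += max(attempt.tool_calls, 1)
        last_answer = attempt.answer
        if verify(question, attempt.answer):
            return attempt.answer
    # No answer was accepted: fall back to the most recent unverified answer.
    return last_answer


# Toy stand-ins so the sketch runs without a real model or search tools.
_candidates = iter(["Paris", "Lyon", "Marseille"])


def toy_agent(question: str) -> Attempt:
    return Attempt(answer=next(_candidates), tool_calls=50)


def toy_verify(question: str, answer: str) -> bool:
    return answer == "Lyon"


result = answer_with_verification("toy question", toy_agent, toy_verify)
```

In this toy run the first candidate is rejected and the second is accepted, so the loop stops after two attempts. A real system would presumably replace `toy_verify` with a call to the agent in a verifier role, and might aggregate several candidates rather than accept the first one that passes.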