Spend Less, Reason Better: Budget-Aware Value Tree Search for LLM Agents
Abstract
A budget-aware value tree framework enables efficient multi-hop reasoning in language models by dynamically balancing exploration and exploitation based on remaining computational resources.
Test-time scaling has become a dominant paradigm for improving LLM agent reliability, yet current approaches treat compute as an abundant resource, allowing agents to exhaust token and tool budgets on redundant steps or dead-end trajectories. Existing budget-aware methods either require expensive fine-tuning or rely on coarse, trajectory-level heuristics that cannot intervene mid-execution. We propose the Budget-Aware Value Tree (BAVT), a training-free inference-time framework that models multi-hop reasoning as a dynamic search tree guided by step-level value estimation within a single LLM backbone. A second key innovation is a budget-conditioned node selection mechanism that uses the remaining resource ratio as a natural scaling exponent over node values, providing a principled, parameter-free transition from broad exploration to greedy exploitation as the budget depletes. To combat the well-known overconfidence of LLM self-evaluation, BAVT employs a residual value predictor that scores relative progress rather than absolute state quality, enabling reliable pruning of uninformative or redundant tool calls. We further provide a theoretical convergence guarantee, proving that BAVT reaches a terminal answer with probability at least 1 − ε under an explicit finite budget bound. Extensive evaluations on four multi-hop QA benchmarks across two model families demonstrate that BAVT consistently outperforms parallel sampling baselines. Most notably, BAVT under strict low-budget constraints surpasses baselines granted 4× its resource allocation, establishing that intelligent budget management fundamentally outperforms brute-force compute scaling.
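The budget-conditioned selection idea described in the abstract can be illustrated with a minimal sketch. This is a hypothetical reconstruction, not the authors' implementation: it assumes positive step-level value estimates per candidate node and uses the remaining-budget ratio ρ as an inverse temperature exponent, so that sampling is roughly value-proportional when the budget is full and approaches greedy argmax as ρ → 0.

```python
import random

def select_child(values, remaining, total, rng=random):
    """Budget-conditioned node selection (illustrative sketch).

    values:    positive value estimates for candidate child nodes
               (names and signature are assumptions, not from the paper).
    remaining: budget units still available; total: initial budget.
    Raising each value to the power 1/rho sharpens the sampling
    distribution as the budget depletes: rho near 1 gives broad,
    value-proportional exploration; rho near 0 is effectively greedy.
    """
    rho = max(remaining / total, 1e-6)  # guard against division by zero
    weights = [v ** (1.0 / rho) for v in values]
    z = sum(weights)
    probs = [w / z for w in weights]
    # Sample an index from the sharpened distribution.
    r, acc = rng.random(), 0.0
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            return i
    return len(values) - 1
```

No extra hyperparameter is introduced: the exploration/exploitation trade-off is driven entirely by the remaining-budget ratio, matching the "parameter-free" claim in the abstract.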
Community
The following similar papers were recommended by the Semantic Scholar API:
- Budget-Aware Agentic Routing via Boundary-Guided Training (2026)
- Budget-Constrained Agentic Large Language Models: Intention-Based Planning for Costly Tool Use (2026)
- Aligning Tree-Search Policies with Fixed Token Budgets in Test-Time Scaling of LLMs (2026)
- Plan-MCTS: Plan Exploration for Action Exploitation in Web Navigation (2026)
- Policy of Thoughts: Scaling LLM Reasoning via Test-time Policy Evolution (2026)
- Spark: Strategic Policy-Aware Exploration via Dynamic Branching for Long-Horizon Agentic Learning (2026)
- Can David Beat Goliath? On Multi-Hop Reasoning with Resource-Constrained Agents (2026)