[2605.05262] Maximizing Rollout Informativeness under a Fixed Budget: A Submodular View of Tree Search for Tool-Use Agentic Reinforcement Learning

arXiv - AI · 4 min read

Statistics > Machine Learning
arXiv:2605.05262 (stat)
[Submitted on 6 May 2026]

Title: Maximizing Rollout Informativeness under a Fixed Budget: A Submodular View of Tree Search for Tool-Use Agentic Reinforcement Learning

Authors: Yuelin Hu, Zhenbo Yu, Zhengxue Cheng, Wei Liu, Li Song

Abstract: We formalize Rollout Informativeness under a Fixed Budget (RIFB) as the expected non-vanishing policy-gradient mass that a tool-use rollout set injects into Group Relative Policy Optimization (GRPO). We prove that any budget-agnostic independent sampler suffers a collapse rate bounded away from zero for hard prompts regardless of the budget. Motivated by this, we recast intermediate state selection as a monotone submodular maximization problem, where a greedy one-step selector enjoys a (1 - 1/e) approximation guarantee. Our Uncertainty-aware Upper Confidence Bound (UUCB) terms arise as closed-form marginal gains of this objective. This turns the token-level entropy bonus from an empirical trick into an analytic consequence of the formulation. We present InfoTree, a training-time tree-search framework coupling UUCB with a learned Adaptive Budget Allocator (ABA) and an asynchronous Speculative Expansion scheme. ABA rescues prompts whose initial tree is wasted on ...
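To make the greedy selector mentioned in the abstract concrete, here is a minimal Python sketch of greedy maximization of a monotone submodular set function under a cardinality budget, the classic setting in which the greedy rule attains a (1 - 1/e) approximation. This is not the paper's UUCB objective or the InfoTree implementation: the function `greedy_select`, the toy `COVERAGE` table, and the set-coverage objective `coverage_value` are all hypothetical stand-ins chosen only because coverage is a standard monotone submodular function.

```python
from typing import Callable, FrozenSet, Hashable, Iterable, List


def greedy_select(
    candidates: Iterable[Hashable],
    objective: Callable[[FrozenSet[Hashable]], float],
    budget: int,
) -> List[Hashable]:
    """Pick up to `budget` items, each round adding the largest marginal gain."""
    remaining = list(candidates)
    chosen: List[Hashable] = []
    current_value = objective(frozenset())
    for _ in range(budget):
        best_item, best_gain = None, 0.0
        for item in remaining:
            # Marginal gain of adding `item` to the current selection.
            gain = objective(frozenset(chosen + [item])) - current_value
            if gain > best_gain:
                best_item, best_gain = item, gain
        if best_item is None:  # no candidate adds value, so stop early
            break
        chosen.append(best_item)
        remaining.remove(best_item)
        current_value += best_gain
    return chosen


# Hypothetical toy data: each candidate state "covers" a set of outcomes.
COVERAGE = {
    "s1": {1, 2, 3},
    "s2": {3, 4},
    "s3": {4, 5, 6, 7},
    "s4": {1, 7},
}


def coverage_value(selected) -> float:
    """Number of distinct outcomes covered (monotone and submodular)."""
    covered = set()
    for name in selected:
        covered |= COVERAGE[name]
    return float(len(covered))


picked = greedy_select(COVERAGE, coverage_value, budget=2)
print(picked, coverage_value(picked))
```

On this toy instance the first round picks `s3` (gain 4) and the second picks `s1` (gain 3), covering all seven outcomes. In the paper's setting the marginal gain would presumably be supplied by the UUCB terms rather than by a coverage count; that mapping is an assumption here, since the abstract does not give the objective in closed form.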

Originally published on May 08, 2026. Curated by AI News.
