[2602.17046] Dynamic System Instructions and Tool Exposure for Efficient Agentic LLMs

arXiv - AI 3 min read Article

Summary

The paper presents Instruction-Tool Retrieval (ITR), a method that makes Large Language Model (LLM) agents more efficient by retrieving, at each step, only the minimal system instructions and the smallest necessary subset of tools, significantly reducing cost and latency.

Why It Matters

As LLM agents become integral to more applications, optimizing how they consume context is crucial. ITR targets a common inefficiency, the re-ingestion of long system instructions and full tool catalogs on every turn, making agents more viable for long-running autonomous tasks.

Key Takeaways

  • ITR reduces per-step context tokens by 95%, enhancing efficiency.
  • Improves correct tool routing by 32% (relative), reducing selection errors.
  • Cuts end-to-end episode cost by 70% compared to traditional methods.
  • Enables LLM agents to perform 2-20x more loops within context limits.
  • Provides operational guidance for practical deployment of ITR.
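
The retrieval-and-gating loop behind these takeaways can be sketched in a few lines. This is an illustrative sketch, not the paper's implementation: the `Item` type, `top_k`, and `tool_gate` names and values are hypothetical, and a real system would score fragments and tools with an embedding retriever against the current step.

```python
# Hypothetical sketch of ITR-style per-step retrieval with a
# confidence-gated fallback. Names and thresholds are illustrative.
from dataclasses import dataclass


@dataclass
class Item:
    text: str
    score: float  # similarity of this fragment/tool to the current step


def compose_step_context(fragments, tools, top_k=3, tool_gate=0.35):
    """Select minimal prompt fragments and a narrowed tool subset for one step.

    If no tool clears the confidence gate, fall back to the full catalog
    so the agent never loses access to a tool it actually needs.
    """
    # Keep only the top-k most relevant system-prompt fragments.
    top = sorted(fragments, key=lambda f: f.score, reverse=True)[:top_k]
    prompt = "\n\n".join(f.text for f in top)

    # Expose only tools that clear the gate; otherwise expose everything.
    confident = [t for t in tools if t.score >= tool_gate]
    exposed = confident if confident else list(tools)
    return prompt, exposed
```

When no tool clears the gate, exposing the full catalog trades that step's token savings for correctness, which corresponds to the confidence-gated fallback the summary describes.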

Computer Science > Artificial Intelligence

arXiv:2602.17046 (cs) [Submitted on 1 Dec 2025]

Title: Dynamic System Instructions and Tool Exposure for Efficient Agentic LLMs
Authors: Uria Franko

Abstract: Large Language Model (LLM) agents often run for many steps while re-ingesting long system instructions and large tool catalogs each turn. This increases cost, agent derailment probability, latency, and tool-selection errors. We propose Instruction-Tool Retrieval (ITR), a RAG variant that retrieves, per step, only the minimal system-prompt fragments and the smallest necessary subset of tools. ITR composes a dynamic runtime system prompt and exposes a narrowed toolset with confidence-gated fallbacks. Using a controlled benchmark with internally consistent numbers, ITR reduces per-step context tokens by 95%, improves correct tool routing by 32% relative, and cuts end-to-end episode cost by 70% versus a monolithic baseline. These savings enable agents to run 2-20x more loops within context limits. Savings compound with the number of agent steps, making ITR particularly valuable for long-running autonomous agents. We detail the method, evaluation protocol, ablations, and operational guidance for practical deployment.

Subjects: Artificial Intelligence (cs.AI)
Cite as: arXiv:2602.17046 [cs.AI]
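
The 2-20x loop figure can be sanity-checked with back-of-the-envelope arithmetic: under a fixed context budget, cutting per-step context by 95% leaves room for up to 1/0.05 = 20x as many steps. The token counts below are illustrative assumptions, not the paper's benchmark values, and the simple division ignores accumulated conversation history, which is one reason the real gain ranges down to 2x.

```python
# Back-of-the-envelope loop headroom under a fixed context budget.
# Token counts are illustrative; the paper reports a 95% per-step reduction.

def max_steps(context_budget_tokens, per_step_tokens):
    """How many agent loops fit before the context window is exhausted."""
    return context_budget_tokens // per_step_tokens


baseline_step = 8_000                    # full instructions + tool catalog
itr_step = int(baseline_step * 0.05)     # 95% smaller retrieved context
budget = 128_000

print(max_steps(budget, baseline_step))  # 16 loops with the monolithic prompt
print(max_steps(budget, itr_step))       # 320 loops with ITR (20x more)
```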
