[2602.17046] Dynamic System Instructions and Tool Exposure for Efficient Agentic LLMs
Summary
The paper presents Instruction-Tool Retrieval (ITR), a method that streamlines Large Language Model (LLM) agents by dynamically retrieving, at each step, only the minimal system instructions and the smallest necessary subset of tools, significantly reducing cost and improving efficiency.
Why It Matters
As LLMs become integral to a growing range of applications, optimizing their operation is crucial. ITR addresses common inefficiencies in agentic LLM loops, improving their viability for long-running autonomous tasks.
Key Takeaways
- ITR reduces per-step context tokens by 95%, enhancing efficiency.
- Improves correct tool routing by 32% (relative), minimizing selection errors.
- Cuts end-to-end episode cost by 70% compared to a monolithic baseline.
- Enables LLM agents to perform 2-20x more loops within context limits.
- Provides operational guidance for practical deployment of ITR.
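The "2-20x more loops" claim follows from simple arithmetic: per-step context cost multiplies by episode length, so shrinking each step's context directly multiplies how many loops fit in a fixed window. A minimal sketch, with token counts that are illustrative assumptions (not figures from the paper):

```python
# Hypothetical numbers chosen to illustrate the compounding effect; the
# only quantity taken from the summary is the ~95% per-step reduction.
MONOLITHIC_TOKENS_PER_STEP = 12_000   # full instructions + entire tool catalog
ITR_TOKENS_PER_STEP = 600             # ~95% smaller retrieved context

def max_loops(budget_tokens: int, tokens_per_step: int) -> int:
    """Number of agent loops that fit in a fixed context budget."""
    return budget_tokens // tokens_per_step

budget = 128_000  # e.g. a 128k-token context window
mono = max_loops(budget, MONOLITHIC_TOKENS_PER_STEP)   # 10 loops
itr = max_loops(budget, ITR_TOKENS_PER_STEP)           # 213 loops
print(mono, itr)  # ITR fits ~20x more loops under these assumptions
```

Because the saving applies at every step, total episode savings grow with episode length, which is why the paper emphasizes long-running agents.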
Computer Science > Artificial Intelligence
arXiv:2602.17046 (cs) [Submitted on 1 Dec 2025]
Title: Dynamic System Instructions and Tool Exposure for Efficient Agentic LLMs
Authors: Uria Franko
Abstract: Large Language Model (LLM) agents often run for many steps while re-ingesting long system instructions and large tool catalogs each turn. This increases cost, agent derailment probability, latency, and tool-selection errors. We propose Instruction-Tool Retrieval (ITR), a RAG variant that retrieves, per step, only the minimal system-prompt fragments and the smallest necessary subset of tools. ITR composes a dynamic runtime system prompt and exposes a narrowed toolset with confidence-gated fallbacks. Using a controlled benchmark with internally consistent numbers, ITR reduces per-step context tokens by 95%, improves correct tool routing by 32% relative, and cuts end-to-end episode cost by 70% versus a monolithic baseline. These savings enable agents to run 2-20x more loops within context limits. Savings compound with the number of agent steps, making ITR particularly valuable for long-running autonomous agents. We detail the method, evaluation protocol, ablations, and operational guidance for practical deployment.
Subjects: Artificial Intelligence (cs.AI)
Cite as: arXiv:2602.17046 [cs.AI]
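The abstract's core loop — retrieve only the relevant system-prompt fragments and tools for the current step, falling back to the full catalog when retrieval confidence is low — can be sketched as follows. This is a toy illustration, not the paper's implementation: the catalogs, the keyword-overlap scorer, and the threshold are all stand-ins for whatever learned retriever and confidence measure ITR actually uses.

```python
# Toy catalogs; a real agent would have many more entries.
TOOLS = {
    "web_search": "search the web for pages matching a query",
    "calculator": "evaluate arithmetic expressions",
    "file_read": "read a file from disk and return its contents",
}
INSTRUCTION_FRAGMENTS = {
    "search_policy": "When searching the web, prefer primary sources.",
    "math_policy": "Show intermediate arithmetic steps.",
    "io_policy": "Never write outside the workspace directory.",
}

def score(query: str, text: str) -> float:
    """Toy relevance score: fraction of query words found in the text.
    A real system would use embeddings or a learned retriever."""
    words = query.lower().split()
    return sum(w in text.lower() for w in words) / len(words)

def retrieve(query: str, catalog: dict, k: int = 1, threshold: float = 0.2) -> list:
    """Return up to k catalog entries scoring above the threshold.
    If nothing clears the bar, fall back to exposing the whole catalog
    (the confidence-gated fallback described in the abstract)."""
    ranked = sorted(catalog, key=lambda name: score(query, catalog[name]), reverse=True)
    selected = [n for n in ranked[:k] if score(query, catalog[n]) >= threshold]
    return selected if selected else list(catalog)

# Per step, compose a narrowed runtime prompt instead of the full context.
step_query = "evaluate this arithmetic expression"
exposed_tools = retrieve(step_query, TOOLS)                    # ['calculator']
prompt_parts = retrieve(step_query, INSTRUCTION_FRAGMENTS)     # ['math_policy']
print(exposed_tools, prompt_parts)
```

The design point the sketch captures is that selection and gating happen every step, so the runtime prompt shrinks to match the step's needs, while the fallback keeps an ambiguous step from being locked out of tools it might require.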