[2603.01548] Graph-Based Self-Healing Tool Routing for Cost-Efficient LLM Agents
Computer Science > Artificial Intelligence
arXiv:2603.01548 (cs)
[Submitted on 2 Mar 2026]
Title: Graph-Based Self-Healing Tool Routing for Cost-Efficient LLM Agents
Authors: Neeraj Bholani
Abstract: Tool-using LLM agents face a reliability-cost tradeoff: routing every decision through the LLM improves correctness but incurs high latency and inference cost, while pre-coded workflow graphs reduce cost but become brittle under unanticipated compound tool failures. We present Self-Healing Router, a fault-tolerant orchestration architecture that treats most agent control-flow decisions as routing rather than reasoning. The system combines (i) parallel health monitors that assign priority scores to runtime conditions such as tool outages and risk signals, and (ii) a cost-weighted tool graph where Dijkstra's algorithm performs deterministic shortest-path routing. When a tool fails mid-execution, its edges are reweighted to infinity and the path is recomputed -- yielding automatic recovery without invoking the LLM. The LLM is reserved exclusively for cases where no feasible path exists, enabling goal demotion or escalation. Prior graph-based tool-use systems (ControlLLM, ToolNet, NaviAgent) focus on tool selection and planning; our contribution is runtime fault tolerance with deterministic recovery and binary observability -- every failure is either a logged...
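The routing mechanism described in the abstract (a cost-weighted tool graph, Dijkstra shortest-path routing, reweighting a failed tool's edges to infinity, and LLM escalation only when no feasible path remains) can be illustrated with a minimal sketch. This is not the paper's implementation: the graph layout, the tool names (`primary_api`, `backup_api`), the edge costs, and the function names `shortest_path`, `mark_failed`, and `route` are all assumptions made for illustration, and the health monitors are omitted.

```python
import heapq
import math


def shortest_path(graph, source, target):
    """Dijkstra over a dict-of-dicts graph. Returns (cost, path) or (inf, None)."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    done = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in done:
            continue
        done.add(node)
        if node == target:
            # Walk predecessor links back to the source.
            path = [node]
            while node in prev:
                node = prev[node]
                path.append(node)
            return d, path[::-1]
        for nxt, weight in graph.get(node, {}).items():
            if math.isinf(weight):
                continue  # edge belongs to a failed tool
            nd = d + weight
            if nd < dist.get(nxt, math.inf):
                dist[nxt] = nd
                prev[nxt] = node
                heapq.heappush(heap, (nd, nxt))
    return math.inf, None


def mark_failed(graph, tool):
    """Reweight every edge into or out of `tool` to infinity."""
    for node, edges in graph.items():
        for nxt in list(edges):
            if node == tool or nxt == tool:
                edges[nxt] = math.inf


def route(graph, source, target):
    """Return a deterministic route, or signal that the LLM must decide."""
    _, path = shortest_path(graph, source, target)
    if path is None:
        return "ESCALATE_TO_LLM", None  # hypothetical status label
    return "ROUTED", path


def demo_graph():
    # Hypothetical tool graph: two interchangeable tools with different costs.
    return {
        "start": {"primary_api": 1.0, "backup_api": 3.0},
        "primary_api": {"done": 1.0},
        "backup_api": {"done": 1.0},
        "done": {},
    }


graph = demo_graph()
print(route(graph, "start", "done"))   # cheapest route uses primary_api
mark_failed(graph, "primary_api")
print(route(graph, "start", "done"))   # recomputed route uses backup_api
mark_failed(graph, "backup_api")
print(route(graph, "start", "done"))   # no feasible path remains
```

In this sketch, recovery after a failure is just a second call to `route` on the reweighted graph, so no model call is involved until both tools are marked failed and the function returns the escalation label.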