[2602.19008] Capable but Unreliable: Canonical Path Deviation as a Causal Mechanism of Agent Failure in Long-Horizon Tasks
Summary
This article explores the reliability failures of language agents in long-horizon tasks, attributing these failures to deviations from canonical solution paths rather than a lack of capability.
Why It Matters
Understanding the causal mechanisms behind agent failures is crucial for improving AI reliability. This research highlights that enhancing agent performance requires more than just scaling capabilities; it necessitates monitoring adherence to established solution paths.
Key Takeaways
- Agent failures in long-horizon tasks are often due to stochastic deviations from canonical paths.
- Successful task completion is significantly correlated with adherence to these canonical paths.
- A monitoring intervention can improve success rates by restarting runs whose adherence metrics indicate drift from the canonical path.
Paper Details
Computer Science > Computation and Language — arXiv:2602.19008 (cs)
Submitted on 22 Feb 2026
Authors: Wilson Y. Lee
Abstract: Why do language agents fail on tasks they are capable of solving? We argue that many such failures are reliability failures caused by stochastic drift from a task's latent solution structure, not capability failures. Every well-defined tool-use task imposes a canonical solution path (i.e., a convergent set of tool invocations shared across successful runs), and agent success depends critically on whether a trajectory stays within this path's operating envelope. We establish this causally using a natural experiment that holds model capability and task difficulty fixed by construction. We analyze trajectories from the Toolathlon benchmark: 22 frontier models each attempt 108 real-world tool-use tasks across 3 independent runs, yielding 515 model$\times$task units where the same model succeeds on some runs and fails on others due to LLM sampling stochasticity alone. Within these units, successful runs adhere significantly more closely to the canonical solution path than failed runs ($+$0.060 Jaccard, $p<0.0001$, $n=488$ units, 95% CI [+0.043, +0.077]). This...
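The adherence idea in the abstract can be sketched in code: approximate the canonical path as the set of tool invocations shared across successful runs, score a new run by its Jaccard similarity to that set, and restart runs whose score falls below a threshold. This is a minimal illustrative sketch, assuming set-valued tool traces; the function names and the 0.5 threshold are hypothetical and not taken from the paper.

```python
# Illustrative sketch of canonical-path adherence (names and threshold are
# assumptions, not the paper's implementation).

def canonical_path(successful_runs: list[set[str]]) -> set[str]:
    """Approximate the canonical path as the tools invoked in
    every successful run (their convergent core)."""
    canonical = set(successful_runs[0])
    for run in successful_runs[1:]:
        canonical &= run
    return canonical

def jaccard_adherence(run: set[str], canonical: set[str]) -> float:
    """Jaccard similarity between a run's tool set and the canonical set."""
    union = run | canonical
    return len(run & canonical) / len(union) if union else 1.0

def should_restart(run: set[str], canonical: set[str],
                   threshold: float = 0.5) -> bool:
    """Monitoring intervention: flag a run for restart when its
    adherence drops below the (hypothetical) threshold."""
    return jaccard_adherence(run, canonical) < threshold

# Example: two successful traces define the core; a drifted run scores low.
successes = [{"search", "read_file", "write_file"},
             {"search", "read_file", "write_file", "list_dir"}]
core = canonical_path(successes)        # {"search", "read_file", "write_file"}
drifted = {"search", "shell_exec"}
print(jaccard_adherence(drifted, core))  # 1 shared tool / 4 in union = 0.25
print(should_restart(drifted, core))     # True
```

In practice one would score adherence incrementally as a trajectory unfolds rather than only at the end, so low-adherence runs can be restarted early.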