[2602.20292] Quantifying the Expectation-Realisation Gap for Agentic AI Systems
Summary
This article examines the expectation-realisation gap in agentic AI systems, revealing discrepancies between anticipated productivity gains and actual outcomes across various domains, including software engineering and clinical documentation.
Why It Matters
Understanding the expectation-realisation gap is crucial for stakeholders in AI deployment, as it highlights the need for realistic assessments of AI capabilities and the importance of integrating human oversight into planning frameworks. This research can inform better decision-making in AI investments and implementations.
Key Takeaways
- Agentic AI systems often underperform compared to initial expectations.
- In a controlled software-development trial, experienced developers were slowed down by AI tools despite expecting a speedup.
- Clinical documentation tools deliver far smaller time savings than vendors promise.
- The gap is influenced by integration challenges and measurement mismatches.
- Structured planning frameworks that record explicit, quantified benefit expectations are needed; a hypothetical sketch of such a record follows this list.
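The last takeaway suggests recording expectations in a form that can be checked after deployment. As a purely illustrative sketch, not the paper's framework, the class and field names below are invented assumptions, a pre-deployment record might pair each quantified benefit claim with the measurement construct it will later be judged against:

```python
from dataclasses import dataclass


@dataclass
class BenefitExpectation:
    """A pre-deployment expectation record (hypothetical schema).

    Field names are invented for illustration; the paper's actual
    planning framework is only partially described in this summary.
    """
    system: str                  # what is being deployed
    metric: str                  # measurement construct the claim is judged by
    expected_change_pct: float   # quantified benefit claim; positive = improvement
    realised_change_pct: float | None = None  # filled in after deployment

    def gap_pp(self) -> float | None:
        """Expectation-realisation gap in percentage points, once measured."""
        if self.realised_change_pct is None:
            return None
        return self.expected_change_pct - self.realised_change_pct


# Pre-registration: a quantified claim with no realised value yet.
claim = BenefitExpectation(
    system="AI coding assistant",
    metric="developer task completion time",
    expected_change_pct=24.0,
)
print(claim.gap_pp())  # None until a post-deployment measurement exists
```

Once a post-deployment measurement exists, setting realised_change_pct makes the gap directly computable and auditable rather than a matter of retrospective impression.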
arXiv:2602.20292 (cs) [Submitted on 23 Feb 2026]
Subject: Computer Science > Software Engineering
Title: Quantifying the Expectation-Realisation Gap for Agentic AI Systems
Authors: Sebastian Lobentanzer
Abstract: Agentic AI systems are deployed with expectations of substantial productivity gains, yet rigorous empirical evidence reveals systematic discrepancies between pre-deployment expectations and post-deployment outcomes. We review controlled trials and independent validations across software engineering, clinical documentation, and clinical decision support to quantify this expectation-realisation gap. In software development, experienced developers expected a 24% speedup from AI tools but were slowed by 19%, a 43 percentage-point calibration error. In clinical documentation, vendor claims of multi-minute time savings contrast with measured reductions of less than one minute per note, and one widely deployed tool showed no statistically significant effect. In clinical decision support, externally validated performance falls substantially below developer-reported metrics. These shortfalls are driven by workflow integration friction, verification burden, measurement construct mismatches, and systematic heterogeneity in treatment effects. The evidence motivates structured planning frameworks that require explicit, quantified benefit expectations with ...
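To make the headline number concrete: the 43-point calibration error in the abstract is simply the signed difference between the expected and realised effects, using the paper's reported figures (the symbol names are our shorthand, not necessarily the paper's notation):

\[
\text{gap} = \text{expected} - \text{realised} = (+24) - (-19) = 43 \ \text{percentage points}.
\]

The sign convention matters here: an expected speedup of 24% and an observed slowdown of 19% are on opposite sides of zero, which is why the gap exceeds either number on its own.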