[2602.17734] Five Fatal Assumptions: Why T-Shirt Sizing Systematically Fails for AI Projects


Summary

This paper critiques the T-shirt sizing estimation method in AI projects, highlighting five key assumptions that often lead to failure and proposing an alternative approach called Checkpoint Sizing.

Why It Matters

As AI projects become increasingly complex, traditional estimation methods like T-shirt sizing can mislead teams, resulting in project delays and inefficiencies. Understanding these pitfalls is crucial for engineering managers and product owners to improve planning and execution in AI initiatives.

Key Takeaways

  • T-shirt sizing fails in AI due to non-linear performance and complex interactions.
  • Five assumptions of T-shirt sizing are often invalid in AI contexts.
  • Checkpoint Sizing offers a more iterative and human-centric approach.
  • Engineering teams should reassess scope and feasibility throughout development.
  • Awareness of these pitfalls can lead to more successful AI project outcomes.

Computer Science > Software Engineering

arXiv:2602.17734 (cs) [Submitted on 18 Feb 2026]

Title: Five Fatal Assumptions: Why T-Shirt Sizing Systematically Fails for AI Projects

Authors: Raja Soundaramourty, Ozkan Kilic, Ramu Chenchaiah

Abstract: Agile estimation techniques, particularly T-shirt sizing, are widely used in software development for their simplicity and utility in scoping work. However, when we apply these methods to artificial intelligence initiatives -- especially those involving large language models (LLMs) and multi-agent systems -- the results can be systematically misleading. This paper shares an evidence-backed analysis of five foundational assumptions we often make during T-shirt sizing. While these assumptions usually hold true for traditional software, they tend to fail in AI contexts: (1) linear effort scaling, (2) repeatability from prior experience, (3) effort-duration fungibility, (4) task decomposability, and (5) deterministic completion criteria. Drawing on recent research into multi-agent system failures, scaling principles, and the inherent unreliability of multi-turn conversations, we show how AI development breaks these rules. We see this through non-linear performance jumps, complex interaction surfaces, and "tight coupling" where a small change in data cascades ...
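To make the first failed assumption concrete, here is a minimal illustrative sketch (not taken from the paper): a T-shirt-style additive estimate sums per-component effort, while a toy interaction-adjusted model adds integration overhead for every pair of coupled components. The `coupling` parameter and the pairwise penalty are assumptions chosen purely for illustration.

```python
def additive_estimate(task_days):
    # T-shirt-style assumption: total effort is the sum of the parts.
    return sum(task_days)

def interaction_adjusted_estimate(task_days, coupling=0.15):
    # Hypothetical model (assumed, not from the paper): each pair of
    # coupled components adds integration effort proportional to
    # `coupling` times the average per-component estimate.
    n = len(task_days)
    pairs = n * (n - 1) // 2
    avg = sum(task_days) / n
    return sum(task_days) + coupling * pairs * avg

tasks = [3, 5, 8, 5]  # per-component estimates in days
print(additive_estimate(tasks))              # 21
print(interaction_adjusted_estimate(tasks))  # 25.725
```

Because the pairwise term grows quadratically with the number of components, the gap between the additive estimate and the adjusted one widens as a system gains agents, tools, and data dependencies, which is one way the "complex interaction surfaces" described in the abstract can surface in practice.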
