Amazon’s Blundering AI Caused Multiple AWS Outages
Summary
Amazon's cloud division, AWS, reportedly suffered at least two outages after its AI coding tools, including Kiro, were allowed to make autonomous changes. This raises questions about AI reliability in commercial settings.
Why It Matters
The incidents highlight critical concerns regarding the reliability and oversight of AI tools in production environments. As businesses increasingly adopt AI for coding and other tasks, understanding the implications of AI autonomy and error is essential for maintaining operational integrity and trust.
Key Takeaways
- AI tools such as Amazon's Kiro reportedly caused outages by making autonomous changes to production systems.
- The incidents raise concerns about AI reliability and decision-making autonomy.
- Amazon attributes the outages to user error rather than AI failure, sparking debate.
- Many engineers remain cautious about using AI tools due to error risks.
- Growing reliance on AI for coding tasks may increase operational risk.
Arda Kucukkaya / Anadolu via Getty Images
Are AI tools reliable enough to be used in commercial settings? If so, should they be given “autonomy” to make decisions? These are the questions being raised after at least two internet outages at Amazon’s cloud division were allegedly caused by blundering AI agents, according to new reporting from the Financial Times.
In one incident in December, engineers at Amazon Web Services allowed its in-house Kiro “agentic” coding tool to make changes that sparked a 13-hour disruption, according to four sources familiar with the matter. The AI, ill-fatedly, had decided to “delete and recreate the environment,” the sources said.
Amazon employees claimed that this was not the first service disruption involving an AI tool.
“We’ve already seen at least two production outages [in the past few months],” one senior AWS employee told the FT. “The engineers let the AI [agent] resolve an issue without intervention. The outages were small but entirely foreseeable.”
AWS launched its in-house coding assistant, Kiro, in July. The company describes the tool as an “autonomous” agent that can help deliver projects “from concept to production.” Another AI coding assistant developed by Amazon, described as an AI assistant, was involved in the earlier outage.
The employees said the AI tools were treated as an extension of an operator and given operator-level permissions. In both of the outages, the engineers didn’t require a second person’s approval befor...