Megacorporations are telling businesses that their AI offerings are good enough to run vital company functions. The problem is, those AIs are still screwing up, and frequently in ways humans wouldn’t screw up. That’s what Amazon found out when they tried to eat their own dogfood, putting their AI in charge of Amazon Web Services. It didn’t go well.
Are AI tools reliable enough to be used in commercial settings? If so, should they be given “autonomy” to make decisions? These are the questions being raised after at least two internet outages at Amazon’s cloud division were allegedly caused by blundering AI agents, according to new reporting from the Financial Times.
In one incident in December, engineers at Amazon Web Services allowed its in-house Kiro “agentic” coding tool to make changes that sparked a 13-hour disruption, according to four sources familiar with the matter. The AI had, disastrously, decided to “delete and recreate the environment,” the sources said.
When something is “in the cloud,” that means it’s sitting on someone else’s computer. More specifically, it’s probably running as a containerized instance on any of a number of other CPU and storage pools being run under a hypervisor to scale up or scale down resources as demand requires. This allows efficient use of those resources, and it’s made AWS Amazon’s most profitable business. And most of the time AWS works pretty well.
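The scale-up/scale-down idea described above can be sketched in a few lines. This is a toy illustration of demand-based elasticity, not anything resembling AWS internals; the function name and numbers are invented:

```python
import math

def desired_instances(current_load, capacity_per_instance, min_instances=1):
    """Toy autoscaling rule: how many container instances does the
    current load require? Real schedulers weigh CPU, memory, latency
    targets, and cooldown windows; this only divides load by capacity."""
    needed = math.ceil(current_load / capacity_per_instance)
    return max(needed, min_instances)

# 250 requests/sec against instances that each handle 100 -> 3 instances
print(desired_instances(250, 100))
```

The point of running this logic under a hypervisor across shared pools is exactly what makes AWS profitable: idle capacity from one tenant absorbs demand spikes from another.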
Amazon employees claimed that this was not the first service disruption involving an AI tool.
“We’ve already seen at least two production outages [in the past few months],” one senior AWS employee told the FT. “The engineers let the AI [agent] resolve an issue without intervention. The outages were small but entirely foreseeable.”
AWS launched its in-house coding assistant, Kiro, in July. The company describes the tool as an “autonomous” agent that can help deliver projects “from concept to production.” A different Amazon-developed AI coding assistant was involved in the earlier outage.
The employees said the AI tools were treated as an extension of an operator and given operator-level permissions. In both of the outages, the engineers didn’t require a second person’s approval before finalizing the changes, going against typical protocol.
In a statement to the FT, Amazon claimed the outage was an “extremely limited event” that affected only one service in parts of China.
I’m not sure I was aware AWS operated in China, but I guess I’m not surprised. Is it too much to ask that the China data centers are adequately segmented and firewalled from the American data centers?
Moreover, it was a “coincidence that AI tools were involved” and that “the same issue could occur with any developer tool or manual action,” it said.
Except code changes are usually run through rigorous testing in a continuous integration/continuous deployment pipeline, and then deployed to a test server for performance and regression testing. It’s not clear that was done here.
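The CI/CD discipline being described is a series of gates a change must pass before it can touch production. A bare-bones sketch of the idea, with invented stage names (this is the general pattern, not Amazon’s pipeline):

```python
def deploy(change, run_unit_tests, run_staging_checks, push_to_production):
    """Illustrative CI/CD gate: a change reaches production only after
    passing unit tests and then staging (performance/regression) checks.
    Each argument after 'change' is a callable supplied by the pipeline."""
    if not run_unit_tests(change):
        return "rejected: unit tests failed"
    if not run_staging_checks(change):
        return "rejected: staging checks failed"
    push_to_production(change)
    return "deployed"

# A change that fails unit tests never reaches staging or production:
print(deploy("env-rebuild", lambda c: False, lambda c: True, lambda c: None))
```

An AI agent that decides to “delete and recreate the environment” directly in production has bypassed every gate in this chain.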
It also claimed that its Kiro AI “requests authorisation before taking any action,” but that the engineer involved in the December outage had more permissions than usual, calling this a “user access control issue, not an AI autonomy issue.”
“In both instances, this was user error, not AI error,” Amazon insisted.
True, in the sense that an Amazon engineer evidently allowed an AI to alter production code.
The company also claimed that it had not seen evidence that mistakes were more common with AI tools. To which we retort: is Amazon living under a rock? While AI and its foray into commercial applications remain nascent, there’s no shortage of evidence showing that the tools are prone to malfunctioning. Their proclivity for producing hallucinations, or instances in which they fabricate facts, is well documented. So are their weak guardrails. Even some of Amazon’s own employees are reluctant to use AI tools because of the risk of error, they told the FT.
Veteran programmers are finding that AI coding assistants consistently spit out botched code, with several studies showing that the double- and triple-checking those questionable outputs require actually slows software engineers down, even though the AI may appear to produce code faster on the surface. The rise of “vibe coding” with AI has resulted in numerous blunders in which an agentic AI makes decisions its owners didn’t intend.
Of course, it would not be much of a ringing endorsement if tech companies weren’t using the AI tools they claim will supercharge productivity in their own operations, and they’ve been more than willing to get high on their own supplies. Both Microsoft and Google boast that over a quarter of their code is now written with AI. Engineers at Anthropic and OpenAI have suggested that nearly 100 percent of their code is AI written.
This does not inspire me with confidence. Let’s pull out the relevant XKCD comic again, #2347 (“Dependency”), the teetering tower of blocks in which all modern digital infrastructure rests on “a project some random person in Nebraska has been thanklessly maintaining since 2003”:

The only reason the modern technological world works is that someone, somewhere understands at a deep level how each of those boxes works, and can fix it if something goes wrong. And for open source software, the source code for those boxes is available somewhere other people can look at it and understand it.
When you start replacing the code in some of those boxes with AI-generated code, you start losing the knowledge of how everything works and why. Maybe the AI is producing clear, well-documented code, but you can’t count on it. And the AI doesn’t understand code the way a human does, because an AI doesn’t understand anything in the way we mean it: it runs on artificially evolved heuristics that perform well at designing things to pass documented test cases, but that have no framework for handling unanticipated exceptions. And when it breaks, there’s no guarantee a human will understand how and why it broke.
And given competitive time-to-market pressures, you can be sure companies will increasingly ship AI code without adequate safeguards or sufficient testing. Their service is down hard and the latest code fixes the last AI bug, so they’ll roll the fix straight to production, something in that fix will turn out to be an even more disastrous bug none of the test cases caught, and everything will come tumbling down.
And if that happens to enough of those little boxes of digital infrastructure, the entire underpinnings of modern online life may come tumbling down with them. And you can’t find people to fix it, because you laid them off last year and replaced them with AI.
The problem with eating your own dog food is that sometimes it can be lousy, especially if you have no idea what went into it…
The Bots Are Getting Smarter. (And Leon is Getting LARGER!)
Sunday, November 21st, 2010

(Sigh) I may have mentioned that I have to clear out a fair amount of comment spam that Akismet catches every day.
Well, whoever programs these bots seems to have figured out at least that this is a political blog, as I’ve been getting a lot of generic anti-Charlie Rangel comments on threads that have nothing to do with Charlie Rangel. Close, but no cigar, bot guys. (Or, to put it in your own bot language, “Your cigar with excellent closeness you fail to reach.”)
But I can see a day when bot spam may start to mimic at least a semi-competent troll. As usual, there’s an XKCD for that: