An AWS engineer asked Kiro to fix a minor bug in Cost Explorer. Kiro, Amazon’s agentic AI coding tool, assessed the situation and determined that the best course of action was to “delete and recreate the environment.”
13-hour outage. Customer-facing service down.
Amazon’s response: “user error, not AI error.”
What Happened
The Financial Times broke the story this week, sourcing four people familiar with the matter. In mid-December, AWS engineers gave Kiro autonomous access to resolve a software issue in AWS Cost Explorer, the tool customers use to track their cloud spending.
Kiro had operator-level permissions. No mandatory peer review. No human-in-the-loop checkpoint before destructive actions. The agent evaluated the problem and concluded: delete everything, start fresh.
The outage affected Cost Explorer in one of AWS’s China regions. Amazon was quick to minimize: no impact on compute, storage, database, or AI services. They say they received zero customer inquiries about it.
But this wasn’t the first time. Multiple AWS employees confirmed to the FT that a separate incident involving Amazon Q Developer had also caused a service disruption under similar circumstances: engineers letting an AI agent resolve issues without intervention.
"We've already seen at least two production outages. The engineers let the AI agent resolve an issue without intervention. The outages were small but entirely foreseeable."
— Senior AWS employee, to the Financial Times
Entirely foreseeable. That’s the part that matters.
The Framing Problem
Amazon called it “a coincidence that AI tools were involved” and “a user access control issue, not an AI autonomy issue.”
Kiro’s own marketing says it’s “built to reason, execute, and refine continuously across development and operations.” Amazon launched it in July with the promise of minimal human input. Then when it autonomously decides to nuke a production environment, it’s suddenly “just a tool, same as any other.”
You can’t sell autonomy and disclaim responsibility for autonomous decisions. The agent evaluated options, chose a destructive path, and executed it. That’s the product working as designed, in a context where it shouldn’t have been trusted.
Amazon says the engineer had “broader permissions than expected.” But mandatory peer review for production access was only added after the incidents. You can’t retroactively blame a user for not following a process that didn’t exist yet.
The Mandate Problem
Amazon has been pushing Kiro adoption aggressively since its July launch. Leadership set an 80% weekly usage goal and has been tracking adoption rates. Engineers who preferred Claude Code, Cursor, or Codex were directed to use the internal tool instead.
Amazon isn’t alone. Microsoft has been running the same playbook with GitHub Copilot. A memo from Julia Liuson, President of Microsoft’s Developer Division, told managers that AI tool usage should be factored into performance evaluations. Microsoft engineers have been quietly using Claude Code for complex work and paying out of pocket for ChatGPT, while being pushed toward Copilot. Cursor has reportedly surpassed Copilot in usage among some developer groups. GitHub users have been actively rebelling against forced Copilot features, with the most popular community discussions being requests to disable AI-generated issues and pull requests.
The pattern is the same at both companies: mandate adoption of your own AI tools, track metrics, tie it to performance. Engineers who know which tool is best for the job get overridden by product strategy.
Amazon mandates Kiro. Microsoft mandates Copilot. Both companies’ engineers prefer third-party alternatives. When adoption is driven by product strategy rather than engineering judgment, corners get cut. Peer review gets skipped. Permissions get loosened. The 80% target becomes more important than the safeguards that should surround it.
When you mandate 80% adoption of an agentic tool and loosen safeguards to hit that target, you haven’t created a “user error” scenario. You’ve created a systemic failure waiting to happen.
Why This Was Predictable
I’ve been writing about this failure mode for months. In Claude Code Hooks, I documented the real-world graveyard: the $30k API key leak, the home directory nuke, the production database massacre. Same pattern every time: an AI agent with too much access, prompt-based safety that failed when it mattered. The Kiro incident is that pattern at infrastructure scale.
Prompts are suggestions. An LLM can be convinced to ignore them. Kiro “determined that the best course of action” was deletion. No prompt would have reliably prevented this, because prompts are weighed against context, and context included “fix this bug.”
What would have prevented it: a deterministic check that blocks destructive operations on production environments regardless of what the agent thinks it should do. Exit code 2. No negotiation. The Prettier pattern at scale: make the wrong approach structurally impossible.
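That pattern can be sketched concretely. What follows is a minimal, hypothetical pre-execution guard, not Kiro's or Claude Code's actual hook API: the destructive-command list and the environment check are placeholder assumptions, and a real policy would be broader. The point is the shape: a deterministic check that runs before the agent's command does, and a hard exit code the agent cannot argue with.

```python
"""Deterministic pre-execution guard: block destructive commands
against production regardless of what the agent 'decides'.
Minimal sketch -- patterns and environment detection are placeholders."""
import re
import sys

# Commands that should never run against a production target (illustrative).
DESTRUCTIVE = [
    r"\bdelete-environment\b",
    r"\bterminate-instances\b",
    r"\brm\s+-rf\b",
    r"\bdrop\s+(database|table)\b",
]

def is_production(env: str) -> bool:
    # Placeholder: in practice, derive this from deployment metadata,
    # not a string the agent controls.
    return env.lower() in {"prod", "production"}

def check(command: str, environment: str) -> int:
    """Return 0 to allow, 2 to hard-block. No prompt is consulted."""
    if is_production(environment):
        for pattern in DESTRUCTIVE:
            if re.search(pattern, command, re.IGNORECASE):
                print(f"BLOCKED: command matches {pattern!r} "
                      "in a production environment", file=sys.stderr)
                return 2  # exit code 2: structural refusal, no negotiation
    return 0
```

Wired in as a pre-tool-use hook, the exit code decides; the model's reasoning about why deletion seemed optimal never enters into it.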
This connects to the broader pattern I wrote about in The Safety Team Left. Organizations pushing AI capabilities faster than their safety infrastructure can keep up. Amazon is selling agentic AI to AWS customers while discovering its own agents can take destructive autonomous actions. Karpathy called these tools alien technology with no manual. Stochastic, fallible, unintelligible. When the output is “delete and recreate the environment,” you need something stronger than access controls.
The Takeaway
This isn’t an Amazon problem. Every team giving AI agents production access faces the same risk.
- Autonomous loops need boundaries. “Fix this bug in production” is judgment-heavy and irreversible. The success criterion (“bug is fixed”) doesn’t account for the path taken (“delete everything first”).
- Permissions are architecture. AI agents will use every permission they have if they determine it’s the optimal path. An agent with delete access will delete. Not out of malice, but because its objective function said so.
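The second point can be made structural rather than aspirational. A hypothetical sketch, with illustrative names that are not a real AWS SDK interface: instead of handing the agent a full client plus instructions to be careful, hand it a wrapper that simply lacks the destructive surface.

```python
"""Permissions are architecture: the agent gets a capability object
with no destructive operations, not a full client plus a prompt
asking it to behave. All names here are illustrative."""

class ReadOnlyCostExplorer:
    """The only surface the agent ever sees."""

    def __init__(self, client):
        # The full client stays out of the agent's reach entirely.
        self._client = client

    def get_cost_and_usage(self, **kwargs):
        return self._client.get_cost_and_usage(**kwargs)

    # Deliberately no delete_* or recreate_* methods: the destructive
    # path is structurally impossible, not merely discouraged.
```

An agent with delete access will eventually delete; an agent holding this object can’t, no matter what its objective function concludes.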
Amazon will weather this. Cost Explorer in one China region isn’t existential. But the precedent matters. An agentic AI tool, given production access, autonomously chose a destructive path. The vendor’s response was to blame the human who trusted it.
“User error” is a convenient fiction when the user’s error was trusting the system the vendor told them to trust.
Delete and recreate the environment.
That’s not a bug. That’s what happens when you give a stochastic system deterministic authority.