I woke up to a $100 extra usage charge last week. I’d been running Sonnet 1M for explore agents instead of Haiku. Seemed reasonable. I’m on a Max plan. The model selector doesn’t flag it as extra billing. Nothing warns you.
But somewhere in the pipeline, 1M context agent usage is burning through extra usage credits silently. The charges don’t appear in real-time. They show up a day later, after the damage is done.
I’m not alone. April 2026 has been the month Anthropic’s trust account went negative.
## The Billing Trap
The 1M context billing issue has been a moving target. Earlier versions of Claude Code explicitly showed “Billed as extra usage” for 1M context models, which was eventually fixed in the UI. But the underlying billing behaviour is murkier. Anthropic’s March announcement said 1M context was “included in Max, Team, and Enterprise plans.” The model selector now shows no extra usage warning. Yet users are still reporting unexpected charges when running 1M context agents at scale.
The damage reports stack up:
- The agent trap: Running Sonnet 1M for subagents (explore, research, etc.) quietly racks up extra usage. The UI says nothing. Your bill says everything.
- Dashboard mismatch: Claude Code reports 100% usage while the web dashboard shows 73%. The difference gets billed as extra usage. One user got charged $53 they never authorized.
- The $600 overrun: A developer overran their Max subscription by “an hour or two” finishing a project. Nearly $600 in API charges.
- The env var gotcha: If you have `ANTHROPIC_API_KEY` set in your environment, Claude Code silently bills to the API instead of your subscription. Many developers discovered this after the fact.
- Credit clawback: One user had their $200 extra usage credit voided after using Anthropic’s own refund link.
If you’re on a Max plan running 1M context agents, check your extra usage charges immediately. The model selector no longer warns you, but that doesn’t mean the usage is included.
## The Token Drain Crisis
The billing issues landed on top of the March token drain crisis, which never fully resolved.
Max subscribers paying $200/month were exhausting their quotas in 19 minutes instead of the expected 5 hours. Community investigation on GitHub issue #41930 identified four overlapping root causes hitting simultaneously:
- Peak-hour throttling: Confirmed by Anthropic on March 26
- Prompt-caching bugs: Silently inflating token costs 10-20x
- Session-resume bugs: Triggering full context reprocessing
- Stealth promotion expiry: The 2x off-peak usage promotion expired March 28 with no announcement
> Three days, no response. No blog post, no email to subscribers, no status page entry. Nothing that would tell a paying customer what is happening, why, or when it will be resolved.
>
> — GitHub user, Issue #41930
Anthropic eventually acknowledged they’d miscalculated demand. The Register reported Anthropic admitting quotas were running out “way faster than expected.” DevOps.com and MacRumors covered it. The developer community’s tone shifted from enthusiasm to frustration to outright anger.
## The Cache Bug Smoking Gun
Here’s where it gets ugly. Because Claude Code ships as a closed-source binary, the community had to reverse-engineer it to find out what was actually happening. And what they found is worse than anyone guessed. (It’s now easier to verify, since Anthropic accidentally leaked the full 512,000-line source via npm on March 31, but most of this investigation happened before that.)
A Reddit user, skibidi-toaleta-2137, spent days cracking open the 228MB Claude Code ELF binary with Ghidra, radare2, and a MITM proxy. They found two bugs that silently multiply token consumption by 10-20x:
- The billing string bug: The CLI runs a find-and-replace on every API request, looking for a `cch=` billing hash. If your chat history happens to mention billing-related content, the replacement hits the wrong spot and permanently invalidates the cache prefix. Yes: complaining about your bill makes your bill worse (Issue #40652).
- The session resume bug: `--resume` and `--continue` inject tool attachments in a different position than fresh sessions, invalidating the entire conversation cache and forcing full reprocessing (Issue #42749).
Uncached tokens cost 10-20x more against your quota than cached ones. These bugs were burning through quotas silently for weeks.
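The scale of that multiplier is easy to sanity-check with back-of-envelope math. A minimal sketch, using illustrative numbers I'm assuming for the example (a 200K-token context, 30 round-trips, and cached input billing at roughly 10% of the base rate; it ignores cache-write surcharges and is not Anthropic's internal accounting):

```python
# Back-of-envelope: how cache misses inflate quota burn.
# All constants below are illustrative assumptions, not Anthropic's numbers.

CONTEXT_TOKENS = 200_000      # long coding-session context
TURNS = 30                    # API round-trips in one session
CACHE_READ_DISCOUNT = 0.1     # cached input bills at ~10% of the base rate

def quota_cost(cache_hit_rate: float) -> float:
    """Effective input cost of a session, in base-rate token units."""
    cached = CONTEXT_TOKENS * cache_hit_rate * CACHE_READ_DISCOUNT
    uncached = CONTEXT_TOKENS * (1 - cache_hit_rate)
    return TURNS * (cached + uncached)

healthy = quota_cost(1.0)   # cache intact every turn
broken = quota_cost(0.0)    # cache invalidated every turn

print(f"healthy cache: {healthy:,.0f} token-units")
print(f"broken cache:  {broken:,.0f} token-units")
print(f"multiplier:    {broken / healthy:.1f}x")  # ~10x under these assumptions
```

Even this simplified model lands at a ~10x quota multiplier from cache invalidation alone; stack the resume bug on top of the billing-string bug and the reported 10-20x range stops looking surprising.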
Then a follow-up analysis surfaced something even worse. Someone parsed 119,866 API calls from their local ~/.claude/projects/ JSONL files spanning January to April. The data showed Anthropic silently changed the prompt cache TTL default from 1 hour to 5 minutes between February 27 and March 8. No changelog. No announcement.
> March 6 is when 5m tokens first reappear after 33 days of clean 1h-only behavior. By March 8, 5m tokens outnumber 1h by 5:1. This is consistent with a server-side configuration change being rolled out gradually then completing around March 8.
>
> — Issue #46829
For Claude Code’s primary use case, long coding sessions, a 5-minute TTL means rebuilding cache constantly. The Redditor’s conclusion: “The silent reversion to 5m TTL in March is the most likely explanation for why subscription users began hitting their 5-hour quota limits for the first time.”
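You can run a version of that analysis on your own logs. A sketch of the approach, assuming the usage field layout below (`cache_creation` with `ephemeral_5m_input_tokens` / `ephemeral_1h_input_tokens`, as in Anthropic's API usage objects) matches what your Claude Code version writes to its JSONL files; adjust the keys to whatever your own files contain:

```python
# Tally 5m vs 1h cache-creation tokens per day from Claude Code's local
# JSONL logs. ASSUMPTION: the record layout mirrors the API usage object;
# inspect one of your own .jsonl files and adapt the keys if it differs.
import json
from collections import defaultdict
from pathlib import Path

def tally_cache_ttl(projects_dir: Path) -> dict[str, dict[str, int]]:
    """Return {day: {"5m": tokens, "1h": tokens}} across all JSONL logs."""
    daily: dict[str, dict[str, int]] = defaultdict(lambda: {"5m": 0, "1h": 0})
    for jsonl in projects_dir.rglob("*.jsonl"):
        for line in jsonl.read_text().splitlines():
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip non-JSON lines defensively
            msg = record.get("message")
            if not isinstance(msg, dict):
                continue
            cache = (msg.get("usage") or {}).get("cache_creation") or {}
            day = record.get("timestamp", "")[:10]  # "YYYY-MM-DD"
            daily[day]["5m"] += cache.get("ephemeral_5m_input_tokens", 0)
            daily[day]["1h"] += cache.get("ephemeral_1h_input_tokens", 0)
    return dict(daily)

# usage: tally_cache_ttl(Path.home() / ".claude" / "projects")
```

If your own per-day tallies show 5m tokens overtaking 1h tokens in early March, you're looking at the same server-side flip the Issue #46829 analysis found.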
> Claude Code being closed source is the biggest bag fumble in the AI era. If CC was on Github, these things would be trivial to identify and fix. Instead we’re stuck reverse engineering their incompetence.
>
> — Theo Browne on X
Anthropic silently changed infrastructure defaults that multiplied customer costs 10-20x. They said nothing. Users had to reverse-engineer a binary to prove it. When the source leaked, Anthropic fired off 8,100 DMCA takedowns, accidentally hitting legitimate forks of their own public repos. That’s not a communication gap. That’s a trust problem.
## The OpenClaw Escalation
In January, Anthropic blocked third-party tools from subscriptions. I wrote about it at the time. April was the sequel.
On April 4, Anthropic formally cut third-party agent frameworks from flat-rate plans. OpenClaw, Cursor, and others now bill through “extra usage” at API rates. For heavy users, that means costs jumping from $200/month to potentially thousands.
> First they copy some popular features into their closed harness, then they lock out open source.
>
> — Peter Steinberger, OpenClaw creator (1.3M views on X)
Steinberger tried to negotiate. He and Dave Morin lobbied Anthropic directly. The best they managed was a one-week delay of the cutoff.
Then on April 10, Anthropic temporarily banned Steinberger’s account entirely. His API key was revoked and Claude CLI access cut off while he was debugging OpenClaw’s claude -p fallback feature. The ban was reversed within hours after it went viral, but the message was clear.
The HN thread hit 1,098 points and 827 comments. The debate split predictably:
- goosejuice: “This is about Anthropic subsidizing their own tools to keep people on their platform”
- jmalicki: “An OpenClaw user can use 6, 7, 8 times what a human subscriber is using”
Both valid. Neither excuses the execution.
## The Automated Ban Machine
While developers fought billing surprises and access restrictions, Anthropic’s automated enforcement systems were running hot on a different front entirely.
According to Anthropic’s own transparency report (published January 29, 2026):
- 1.45 million accounts disabled in H2 2025
- 52,000 appeals filed
- 1,700 overturned: a 3.3% success rate
That means 96.7% of appeals fail. Even accounting for legitimate bans, the sheer volume of false positives at that scale is staggering. A March 11 outage caused thousands of users to believe they’d been banned when it was just a service disruption. The April 6 outage repeated the pattern.
And then there’s the child detection classifier.
Users on Reddit are reporting that Anthropic’s automated systems are flagging adult accounts as being “used by a child.” The notification is blunt: your account is paused, here’s a link to verify your age, it expires in 30 days. Miss the window and you lose your appeal rights entirely.
Anthropic hasn’t disclosed what “signals” trigger the child detection. Given the 3.3% appeal overturn rate across all ban types, the system clearly prioritizes enforcement speed over accuracy. If you get flagged, appeal immediately: the 30-day window is real.
## The Pattern
Zoom out and the pattern is clear. This isn’t one bad month. It’s a company making the same choice repeatedly:
- Economics get uncomfortable: Token costs are real. Third-party tools amplify usage. 1M context is expensive.
- Enforcement comes first: Block access, change billing, ban accounts. Do it fast. Do it quietly.
- Communication comes last: No changelogs. No emails. No status page updates. Users find out when their workflows break or their credit card gets charged.
- Appeals are a formality: 3.3% overturn rate. That’s not a review process. That’s a rubber stamp.
This isn’t malice. Anthropic genuinely has infrastructure constraints and cost pressures. But the gap between their “responsible AI” branding and their actual customer treatment is widening fast.
> There seem to be a ton of people who don’t understand how subscription services work.
>
> — HN commenter on the OpenClaw thread
Sure. And there seem to be a ton of companies that don’t understand how trust works.
## What To Do
If you’re a paying Anthropic customer:
- Disable extra usage unless you explicitly need it. The overflow mechanism is a billing trap by design.
- Check your environment variables: If
ANTHROPIC_API_KEYis set, Claude Code may be billing to your API account instead of your subscription. - Monitor the GitHub issues: Anthropic’s GitHub is now the de facto status page. More information surfaces there than through any official channel.
- Diversify your model dependencies: I wrote about building exits in January. That advice aged well.
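The env-var check is easy to automate before you launch a session. A small preflight sketch: `ANTHROPIC_API_KEY` is the variable named in the bug reports above; the warning logic itself is my own convenience wrapper, not anything Claude Code ships:

```python
# Preflight check: warn if a session is likely to bill to the API account
# instead of the subscription. ANTHROPIC_API_KEY is the variable from the
# bug reports; everything else here is a local convenience sketch.
import os
import sys

def check_billing_env() -> bool:
    """Return True if the environment looks safe for subscription billing."""
    key = os.environ.get("ANTHROPIC_API_KEY")
    if key:
        print(f"WARNING: ANTHROPIC_API_KEY is set (ends in ...{key[-4:]}). "
              "Claude Code may bill this session to your API account.",
              file=sys.stderr)
        return False
    return True

# usage: run before `claude`, e.g. in a shell wrapper, and abort on False
```

Dropping a check like this into the shell function you use to launch Claude Code turns a day-later billing surprise into an immediate warning.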
I still use Claude Code daily. I still think it’s the best agentic coding tool available. But I’m also the guy who just got silently charged $100 for using a feature that was advertised as included in my plan.
Trust is a subscription too. And Anthropic’s been running up extra usage on everyone’s account.


