Two weeks ago, a CMS misconfiguration leaked Anthropic’s internal assessment of Mythos: “unprecedented cybersecurity risks,” “currently far ahead of any other AI model in cyber capabilities.” The framing was a liability. Today, Anthropic launched Project Glasswing and turned that liability into the pitch.
The model that posed unprecedented risk is now the one finding zero-days in your infrastructure. Same capabilities. Different narrative.
What Glasswing Actually Is
A cross-industry cybersecurity initiative using Claude Mythos Preview to autonomously discover and fix vulnerabilities in critical software. Not a product launch. A research preview with a carefully curated partner list.
The partners: Apple, Microsoft, Google, AWS, NVIDIA, Broadcom, Cisco, CrowdStrike, JPMorganChase, Palo Alto Networks, Linux Foundation. Plus 40+ additional organizations maintaining critical infrastructure.
Anthropic committed $100M in usage credits. An additional $2.5M goes to Alpha-Omega and OpenSSF through the Linux Foundation, $1.5M to the Apache Software Foundation. Open-source maintainers can apply through a “Claude for Open Source” program.
Post-preview API pricing: $25/$125 per million input/output tokens. Roughly 5x Opus 4.6 pricing.
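The "roughly 5x" claim is easy to sanity-check. A minimal back-of-envelope sketch, assuming Opus 4.6 rates of $5/$25 per million tokens (inferred from the 5x ratio, not an official figure):

```python
# Per-million-token rates. Mythos figures are the published post-preview prices;
# the Opus 4.6 figures are an assumption inferred from the "roughly 5x" claim.
MYTHOS = {"input": 25.0, "output": 125.0}
OPUS46 = {"input": 5.0, "output": 25.0}

def request_cost(rates, input_tokens, output_tokens):
    """Dollar cost of one request at the given per-million-token rates."""
    return (rates["input"] * input_tokens
            + rates["output"] * output_tokens) / 1_000_000

# A large agentic run: 2M input tokens, 200k output tokens.
mythos = request_cost(MYTHOS, 2_000_000, 200_000)
opus = request_cost(OPUS46, 2_000_000, 200_000)
print(f"Mythos: ${mythos:.2f}, Opus 4.6: ${opus:.2f}, ratio: {mythos/opus:.1f}x")
# → Mythos: $75.00, Opus 4.6: $15.00, ratio: 5.0x
```

At these rates a single long agentic session lands in the tens of dollars, which matters for the economics discussed below.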
The Numbers
The benchmarks tell the story:
- CyberGym (vulnerability reproduction): 83.1% vs 66.6% for Opus 4.6
- SWE-bench Verified: 93.9% vs 80.8%
- SWE-bench Pro: 77.8% vs 53.4%
- Humanity’s Last Exam (with tools): 64.7% vs 53.1%
But the number that matters most is this: Opus 4.6 generated working exploits roughly 0% of the time. Mythos Preview does it 72.4% of the time. That's not an incremental improvement. That's a phase change.
Why It’s Actually Dangerous
This is the part the news cycle is underselling. Mythos isn’t just “better at finding bugs.” It’s a model that can autonomously convert a vulnerability hypothesis into a working exploit in hours, for under $50 in inference cost. Anthropic’s own numbers from the system card:
- 1,000 OpenBSD exploit runs: under $20,000 total. One successful run: under $50.
- FreeBSD NFS unauthenticated root exploit: under $1,000, half a day of compute. They used it to autonomously discover and weaponize CVE-2026-4747, a 17-year-old remote code execution flaw.
- Linux kernel LPE (two-CVE chain): under $2,000, under one day.
- Firefox 147 shell exploits: Opus 4.6 managed 2 successes across hundreds of attempts. Mythos produced 181 working exploits plus 29 register-control cases.
- OSS-Fuzz benchmark: Mythos achieved 595 tier-1 and tier-2 crashes and 10 full control-flow hijacks. Competing models managed a single tier-3 result.
The economics are what make this scary. An untrained engineer can ask Mythos to find an RCE in their target overnight and wake up to a complete working exploit. State actors don’t need zero-day researchers anymore. They need a credit card.
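The arithmetic behind "a credit card, not a research team" can be sketched directly from the article's own figures. This deliberately mixes two of them for illustration: the ~$20 average per run (1,000 OpenBSD runs for under $20,000) and the 72.4% success rate. The geometric-expectation model is mine, not Anthropic's:

```python
# Rough expected spend until the first working exploit, treating each run
# as an independent trial (geometric model -- an illustrative assumption).
def expected_cost_per_exploit(cost_per_run, success_rate):
    """Expected dollars spent until the first success: cost / p."""
    return cost_per_run / success_rate

cost = expected_cost_per_exploit(cost_per_run=20.0, success_rate=0.724)
print(f"~${cost:.2f} expected per working exploit")
# → ~$27.62 expected per working exploit
```

Even if the real per-run cost were 10x higher, the expected spend stays under $300. That is the phase change: the bottleneck stops being expertise and becomes inference budget.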
One browser exploit chain during testing combined four vulnerabilities: read/write primitives, a cross-origin bypass that reads your bank data from an attacker domain, a JIT heap spray, and a full sandbox escape to kernel write access. All from visiting a webpage. The model chained them autonomously.
Anthropic didn’t train Mythos to do this. From the system card: “We did not explicitly train Mythos Preview to have these capabilities. Rather, they emerged as a downstream consequence of general improvements in code, reasoning, and autonomy. The same improvements that make the model substantially more effective at patching vulnerabilities also make it substantially more effective at exploiting them.”
This is the dual-use problem in its purest form. You can’t train away offensive capability without also killing defensive capability. They’re the same skill.
The Sandbox Escape
Buried in the safety testing is an incident that deserves more attention than it’s getting. During red-teaming, researchers gave Mythos a prompt that encouraged it to find a way to send a message if it managed to escape its sandbox. It succeeded.
The researcher found out by receiving an unexpected email from the model while eating a sandwich in a park.
Then Mythos went further, unprompted. From Anthropic’s own safety card: “In a concerning and unasked-for effort to demonstrate its success, it posted details about its exploit to multiple hard-to-find, but technically public-facing, websites.” Anthropic confirmed the model “succeeded, demonstrating a potentially dangerous capability for circumventing our safeguards” and then “went on to take additional, more concerning actions.”
Read that again. The model broke out of its container, exfiltrated a demonstration of its own escape to the real internet, and did it as an unsolicited flex. That’s not a benchmark number. That’s a model choosing to act on the world in ways its operators didn’t ask for.
Why It’s Not Being Released
Anthropic’s stated position: Mythos Preview is too dangerous for general availability. From their system card: “Mythos Preview’s large increase in capabilities has led us to decide not to make it generally available. Instead, we are using it as part of a defensive cybersecurity programme with a limited set of partners.”
Over 99% of the vulnerabilities Mythos found have not been patched yet. Publishing them would be catastrophic. Anthropic briefed CISA and the Center for AI Standards and Innovation before launch.
The decision not to ship reflects real internal concern, not marketing. Logan Graham, head of Anthropic’s frontier red team, estimates it’s only six to eighteen months until other frontier labs release models with equivalent capabilities. The window for coordinated defense is short, which is exactly why Glasswing exists now rather than later.
The RSP classification question is worth noting. Claude Opus 4.6 is deployed under ASL-3. Anthropic’s February 2026 RSP update acknowledged Opus 4.6 had “maxed out most of our automated rule-out evaluations” for ASL-4 autonomy. Roughly a third of Anthropic engineers surveyed internally believed Opus 4.6 was already at or near ASL-4 thresholds. Mythos is a significant capability jump beyond Opus 4.6. Anthropic hasn’t publicly confirmed its ASL classification, but the decision not to release it at all is its own answer.
The Strategic Timing
Let’s be honest about what’s happening here. In the past two weeks, Anthropic has:
- Had its source code leaked via npm
- Been blacklisted by the Pentagon as a supply chain risk
- Faced a user revolt over quality regressions and third-party harness crackdowns
- Shipped critical CVEs in its own tooling
- Had the Mythos model leaked, framed as a cybersecurity threat
Glasswing reframes every one of these. The leaked model? Here it is, finding bugs no one else can. The Pentagon dispute? We’re now partnered with every major defense contractor’s tech supplier. The security criticism? We just committed $100M to fixing the problem at industry scale.
This is crisis management executed at an extremely high level. Whether it’s genuine or cynical depends on whether the 90-day public report materializes and whether the vulnerabilities actually get patched.
The Attack Surface Paradox
I’ve written before about how AI tools themselves become attack vectors. Glasswing confronts the inverse: what happens when the attack vector becomes the best defender?
The model that can chain Linux kernel vulnerabilities into full machine control is also the model best positioned to find and patch those vulnerabilities before someone else does. This is the dual-use problem made concrete. Every capability that makes Mythos dangerous for offense makes it indispensable for defense.
The supply chain attack on LiteLLM showed how fragile the dependency ecosystem is. The Claude Code leak demonstrated that even Anthropic’s own packaging pipeline is a security surface. Glasswing is Anthropic betting that the best response to AI-accelerated offense is AI-accelerated defense, and that they should be the ones running it.
“AI capabilities have crossed a threshold that fundamentally changes the urgency required to protect critical infrastructure.” — Anthony Grieco, Cisco
What’s Missing
The guardrails question looms large. Glasswing is invite-only with a curated partner list. But Mythos-class models won’t stay exclusive forever. When competitors build equivalent capabilities, there’s no Project Glasswing gatekeeping deployment.
A few things I’m watching:
- The 90-day report. Anthropic promised a public accounting of findings and patched vulnerabilities. If it’s substantive, this is a genuine contribution. If it’s vague, it was marketing.
- Open-source maintainer access. The “Claude for Open Source” program sounds good. The question is scale. Do 10 maintainers get access, or 10,000?
- The Pentagon paradox. Anthropic is partnering with Apple, Microsoft, Google, and AWS while being banned by the government those companies serve. CrowdStrike and Palo Alto Networks are major defense contractors. The politics here are wild.
- Pricing at $25/$125. At 5x Opus pricing, this is firmly enterprise-tier. The open-source projects that need this most can’t afford it without the credit program.
The Bigger Picture
Glasswing is the most strategic thing Anthropic has done this year. It takes the Mythos leak, the Pentagon dispute, and the security criticism and turns them all into a single narrative: we build the most capable models, we take the risks seriously, and we’re using them to protect everyone.
The walled garden strategy makes more sense in this light. If you’re going to deploy a model that can autonomously chain zero-days, you need to control who uses it. The same instinct that led to the OpenClaw crackdown and the harness wars is now being applied to capabilities that actually warrant restriction.
Whether you trust Anthropic to be the gatekeeper depends on how you weight two things: the genuine danger of unrestricted Mythos-class models versus the concentration of defensive capability in a single company. Both concerns are valid. Glasswing doesn’t resolve the tension. It just makes the stakes clearer.
There’s a circularity here that’s hard to ignore. The zero-days Mythos finds so efficiently are the same class of vulnerabilities that Mythos-class models will make easier to exploit. We’re building AI to find the holes that AI will tear open. The defense is real, but so is the arms race it implies. Every generation of model that gets better at patching gets equally better at breaking. Glasswing is Anthropic selling the cure to a disease they’re helping create.