Claude Mythos zero-daysAI securityvulnerability disclosureClaude Mythos 10000 zero-days patching crisisAI model finds zero-day vulnerabilities faster than humans patch

Claude Mythos Finds 10,000 Zero-Days, Patching Pipeline Breaks

24 May 20266 min readAlex Drover

// IN THIS ARTICLE

01What Happened 02Technical Anatomy 03Who Gets Burned 04Playbook for Security Teams 05Key Takeaways 06Frequently Asked Questions

Anyone who has ever shipped a CVE advisory at 11pm on a Friday knows the bottleneck isn't finding bugs, it's getting humans to triage, confirm, and patch them. Anthropic just published numbers that turn that bottleneck into a structural crisis. In one month, an unreleased model found more critical zero-days than most vendors process in a year.

The disclosure landed on May 23, 2026, and the math underneath it is what should worry every platform lead reading this.

What Happened

Anthropic unveiled the first-month results of Project Glasswing, a defensive consortium built around an unreleased model called Claude Mythos Preview. As CyberSecurityNews reported, the model autonomously discovered more than 10,000 high- and critical-severity zero-day vulnerabilities across production codebases belonging to over 50 partner organizations, including Microsoft, Apple, Google, and Cloudflare.

Cloudflare alone reported 2,000 bugs surfaced by the model, 400 of them high or critical, and noted the false-positive rate beats human testers. Mozilla used Mythos Preview to find and patch 271 vulnerabilities in Firefox 150, ten times what previous testing with Claude Opus 4.6 produced. The UK's AI Security Institute said Mythos Preview is the first model to fully solve its multistep cyberattack simulations.

Anthropic also pointed the model at over 1,000 open-source projects. One finding, CVE-2026-5194, was a critical flaw in the wolfSSL cryptography library that allowed certificate forgery. The model didn't just spot the bug, it built a working exploit capable of silently spoofing banking or email domains.

Then the gap shows up. The initial scan produced 23,019 candidate findings. External firms reviewed 1,900 of them and confirmed 1,726 as true positives, a 90.8% rate. Anthropic forwarded 1,596 vetted findings to maintainers. Only 97 have been patched upstream. Only 88 advisories published.

Citing dual-use risk, Anthropic is keeping Mythos out of public release and restricting it to consortium members. For everyone else, there's Claude Security in public beta, running on Opus 4.7, which has reportedly helped patch over 2,100 corporate vulnerabilities.

Technical Anatomy

The headline number is 10,000 zero-days. The interesting number is 90.8%. That confirmation rate, on 1,900 audited samples, says the model isn't just generating plausible-looking alerts. It's producing findings that survive adversarial review by paid external security firms. Cloudflare's claim that the false-positive rate beats human testers is consistent with that.

The wolfSSL case shows what changed. Static analyzers find suspicious patterns. Human researchers find bugs and write exploits over weeks. Mythos Preview did both in one pass: identified the certificate forgery primitive and constructed a functional exploit. The economic cost of producing a weaponized zero-day against a widely deployed crypto library just collapsed toward zero.

That collapse breaks the assumption underneath the 90-day coordinated disclosure window. The window exists because finding a bug was expensive, so defenders had a head start. Once discovery is cheap, the window becomes a synchronized starting pistol for anyone with comparable tooling. And there will be comparable tooling. Anthropic is explicitly considering a future Mythos-class release.

Look at the funnel: 23,019 candidates, 1,596 vetted reports sent, 97 patches landed. That's a 6% patch rate against reported findings, not against raw scan output. The triage capacity of upstream maintainers, mostly volunteers, is the choke point. AI-driven discovery is producing high-quality reports at a cadence that human reviewers, CI pipelines, release engineers, and downstream package consumers physically cannot absorb.

Anthropic is trying to widen the funnel with what it gives Cyber Verification Program partners: specialized skills, codebase-mapping harnesses, automated threat model builders. Cisco open-sourced its Foundry Security Spec so defenders can build AI-assisted evaluation systems. Useful, but the bottleneck isn't evaluation. The bottleneck is the human maintainer of a transitively-depended-on library who has a day job and now has 40 well-written vulnerability reports in their inbox.

My take: the disclosure pipeline was designed for an era when bug-finding was the scarce resource. That era ended this month.

Who Gets Burned

Open-source maintainers first. The 97-patches-from-1,596-reports gap isn't a story about lazy projects. It's a story about volunteer humans being asked to scale linearly against AI throughput. In production incidents I've seen, the worst outages trace back to unpatched transitive dependencies, not the top-level library a team actually audits. If Mythos-class scanning becomes standard, every iGaming platform and fintech app running Node, Python, or Go is sitting on a stack where the maintainer queue is now the binding constraint.

iGaming operators with PCI scope and live-money flows are exposed twice: through their own codebases and through the payment, KYC, and crypto libraries underneath them. wolfSSL is the canary. A certificate forgery primitive in any widely deployed TLS or signing library is a direct path to invisible MITM against deposit and withdrawal endpoints. Teams I've worked with at European operators run 200-plus npm and OS packages in their hot path. If even 1% of those get a Mythos-style disclosure with no upstream patch for weeks, the security team owns the mitigation.

Fintechs face the same problem with sharper regulatory teeth. DORA and PSD3 reviewers will ask why a known, vetted finding sat unpatched. "Maintainer hadn't merged it yet" is not an answer that survives a regulator meeting.

Enterprise infra vendors that compete with Cloudflare, Microsoft, and Google are also burned, just quietly. Those four are inside the consortium. Everyone else is reading about it. That's a real competitive moat: 2,000 internal bugs found and fixed before disclosure pressure starts. The uncomfortable read: Project Glasswing is partly a security program, partly a cartel of who gets to harden first.

The CVE-2026-5194 disclosure also reframes threat modeling for any team that assumes TLS pinning and standard certificate validation are sufficient.

Playbook for Security Teams

Stop treating "patch when upstream patches" as a strategy. With only 97 of 1,596 vetted findings landing upstream, the assumption that the ecosystem will patch faster than attackers weaponize is dead for the next 12 months. Build compensating controls now.

Concrete actions for this week:

Inventory your dependency tree against the 1,000 open-source projects Anthropic scanned. If you don't know which of your libraries are in scope, assume the popular ones are.
Tighten default configurations on TLS, certificate validation, and any code path that consumes external signatures. The wolfSSL exploit class will have siblings.
Enforce MFA everywhere, including service-to-service where SPIFFE or mTLS is an option. The 90-day disclosure window is now a hostile timer, not a friendly one.
Wire behavioral analytics into your detection stack to cut mean time to detect. If you can't prevent exploitation of an unpatched library, detect the post-exploitation behavior.
Map your exposure to MITRE ATT&CK initial-access and credential-access techniques that benefit most from certificate forgery and library-level RCE.
Evaluate Claude Security or equivalent AI-assisted scanners against your own codebase. The 2,100 corporate vulnerabilities patched figure suggests it pays for itself fast on any non-trivial monorepo.

If you maintain an open-source project your platform depends on, allocate paid engineering time to it. Treating critical dependencies as free is the bet that breaks here. A maintainer drowning in AI-generated reports is your incident, not theirs.

Key Takeaways

Claude Mythos Preview found over 10,000 high- and critical-severity zero-days in one month with a 90.8% confirmed true-positive rate, ending the era of expensive bug discovery.
Only 97 of 1,596 vetted upstream findings have been patched, exposing maintainer triage capacity as the new systemic risk.
CVE-2026-5194 in wolfSSL is a preview of the threat class: critical crypto library bugs with model-generated working exploits.
The 90-day coordinated disclosure window assumes attackers can't independently rediscover bugs cheaply. That assumption no longer holds.
Consortium membership (Microsoft, Apple, Google, Cloudflare, Mozilla) is now a meaningful security moat against everyone outside it.

Frequently Asked Questions

Q: What is Project Glasswing and why does it matter?

Project Glasswing is Anthropic's defensive cybersecurity consortium of over 50 organizations, including Microsoft, Apple, Google, and Cloudflare, using the unreleased Claude Mythos Preview model to find vulnerabilities in critical infrastructure. It matters because the model found more than 10,000 high- and critical-severity zero-days in its first month, demonstrating that AI-driven vulnerability discovery has outpaced the human capacity to triage and patch.

Q: Why is Anthropic keeping Claude Mythos Preview restricted?

Anthropic cited severe dual-use risks. The model doesn't only identify flaws, it autonomously constructs functional exploits, as it did for CVE-2026-5194 in wolfSSL. Releasing it publicly would hand the same capability to attackers, so access is limited to defensive consortium members while Anthropic offers Claude Security on Opus 4.7 as a public-beta alternative for enterprise clients.

Q: What should security teams do about the patching gap?

With only 97 of 1,596 vetted findings patched upstream, teams should stop relying on timely upstream patches and invest in compensating controls: strict default configurations, MFA everywhere, behavioral analytics to reduce mean time to detect, and dependency inventories tied to active scanning. Funding the maintainers of critical open-source libraries is also moving from optional to operational necessity.

Alex Drover

RiverCore Analyst · Dublin, Ireland

// RELATED ARTICLES