The Revelation That Reset the Security Clock
On April 7, 2026, Anthropic published a blog post that should have generated more attention than it did. The company announced Project Glasswing: a coordinated initiative involving 12 of the largest technology companies in the world, a substantial financial commitment, and a cybersecurity AI model so capable that Anthropic made a decision that would have been unthinkable a few years earlier. They would not release it publicly. The offensive capabilities were too great.
The model, called Claude Mythos Preview, had found thousands of zero-day vulnerabilities across every major operating system and browser in use today. It found a remote crash vulnerability in OpenBSD that had existed for 27 years. It found a bug in FFmpeg that had survived 16 years of automated testing, including over 5 million test executions that failed to catch it. It found a Linux kernel privilege escalation chain that researchers had missed. In a matter of months, one AI model had surfaced more previously unknown critical vulnerabilities than the entire global security research community combined finds in a year.
The findings are not the most disturbing part. The most disturbing part is why those findings are possible at all.
The Coalition: 12 Companies That Normally Sue Each Other
The partner list reads like a ceasefire map of the technology industry. AWS, Anthropic, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorgan Chase, Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks. These companies compete aggressively on cloud infrastructure, on security products, on developer tools, on financial services. Microsoft and Google spend billions annually litigating against each other. Apple has spent years in regulatory battles with Qualcomm, Broadcom, and others. JPMorgan Chase has sued websites over trademark issues. Palo Alto Networks and CrowdStrike compete directly in the endpoint security market.
And yet they are all in the same room, sharing vulnerability data through a 45-day coordinated disclosure process, and collectively funding the development of an AI model designed to find the flaws in software they all depend on.
A ZDNET editor described the dynamic accurately when she noted that the threat level had reached "mutually assured destruction" proportions. When the software ecosystem that underpins all 12 companies is riddled with 27-year-old bugs that one AI can find in hours, competitive dynamics become irrelevant. The shared exposure is larger than any competitive advantage. This is not altruism. It is a recognition that the aggregate risk to all 12 companies exceeds the aggregate benefit of any one company's defensive advantage.
Beyond the original 12, Anthropic has granted access to more than 40 additional organizations. The scope of the initiative is larger than the headline partners suggest.
Claude Mythos Preview: The Model That Changes the Equation
Claude Mythos Preview is a general frontier AI model, not a specialized security tool. Anthropic trained it on a broad corpus and found that it had extraordinary capability at vulnerability discovery as a byproduct. The performance numbers on standard benchmarks tell part of the story.
On CyberGym, a competitive programming platform for cybersecurity challenges, Mythos Preview scored 83.1%. Claude Opus 4.6, Anthropic's previous best model, scored 66.6%. The 16.5 percentage point gap is not a marginal improvement. In security terms, it is the difference between a competent analyst and a researcher who finds things others miss.
On SWE-bench Verified, a benchmark measuring an AI model's ability to resolve real software engineering issues from open source repositories, Mythos Preview scored 93.9%. The previous state of the art among comparable models was 80.8%. The 13.1 point gap means Mythos Preview resolves issues that other models classify as beyond their capability.
The benchmark numbers are proxies. What Mythos Preview actually did with these capabilities is more concrete. It found thousands of zero-day vulnerabilities across Linux, Windows, macOS, Android, iOS, Chrome, Firefox, Safari, and every major open source project in widespread deployment. The scope of findings was not incremental. It represented a categorical expansion of known vulnerability surface.
Three Vulnerabilities That Reveal Everything
The most illuminating way to understand what Mythos Preview represents is to examine three specific discoveries in detail.
The OpenBSD remote crash vulnerability had existed for 27 years. OpenBSD is a security-focused Unix-like operating system with a reputation for rigorous code review. The vulnerability was present in a network stack component that had been examined by thousands of researchers over nearly three decades. It survived every audit. Mythos Preview found it in the process of analyzing the codebase for a different task, flagged it as a potential issue, and confirmed the crash condition within hours of the first flag.
The FFmpeg vulnerability had existed for 16 years. FFmpeg is the multimedia processing backbone of virtually every video application on the planet, from mobile players to streaming servers to browser media components. The project had run over 5 million automated test executions. These tests cover the vast majority of code paths in a project of FFmpeg's complexity. The bug survived all of them. It was not an edge case buried in rarely-executed code. It was in a code path that the automated test suite exercised regularly. Mythos Preview found it in the same manner it found the OpenBSD flaw: as a secondary finding during unrelated analysis.
The Linux kernel privilege escalation chain involved a multi-step exploitation path through kernel memory management subsystems. Privilege escalation vulnerabilities in the Linux kernel are among the most critical security issues that can exist, because they allow unprivileged user code to gain root access to the entire system. This particular chain had gone undetected despite years of focused kernel security research, formal verification efforts on security-critical subsystems, and the thousands of security engineers globally who make a living finding exactly this kind of flaw.
What these three cases have in common is not their age or their severity, though both are significant. What they have in common is that they all survived decades of automated testing, manual code review, formal verification, and focused security research. The existing security infrastructure of the software industry failed to find them. One AI model found them in months.
What This Reveals About the Security Industry
The implication of these findings is uncomfortable and important enough to state directly: decades of investment in automated testing, static analysis, fuzzing, and code review have not made software meaningfully more secure. These tools have value. They catch a category of bugs. But they have a fundamental blind spot, and that blind spot is exactly the category of vulnerability that Mythos Preview found.
Static analysis tools analyze code without executing it. They can find certain classes of bugs through pattern matching and data flow analysis, but they cannot find bugs that require understanding the interaction between multiple code paths across millions of lines of code. Fuzzing tools generate random inputs and monitor for crashes, but they are only as good as the coverage they achieve, and coverage of all code paths in a complex codebase remains computationally intractable. Code review catches bugs that reviewers think to look for, which means it systematically misses bugs that fall outside the reviewer's mental model of how the code works.
Mythos Preview appears to find bugs that all of these approaches miss, not because it is magic, but because it reasons about code differently. It can hold an entire large codebase in its context window and reason about behavioral implications across all of it in ways that neither static analysis nor fuzzing nor human review can replicate.
The uncomfortable conclusion is that our security tooling has a massive blind spot. We have built infrastructure that makes it easy to ship code quickly. We have not built infrastructure that makes that code meaningfully secure. The existence of 27-year-old vulnerabilities in actively maintained, security-focused codebases is not an anomaly. It is a data point about the structural limitations of our current approach.
The Paradox of the Unreleased Model
Anthropic has not released Claude Mythos Preview publicly, and it has not published the model weights. The reason, stated plainly in the announcement, is that the offensive capabilities of the model are too great. The ability to find zero-day vulnerabilities at scale is a dual-use capability. The same model that finds vulnerabilities to report for patching can find vulnerabilities to exploit before they are patched. Releasing it publicly would give malicious actors the same capability that the 12 partner companies are using for defense.
The paradox is that this distinction is not as clean as it sounds. The model's vulnerability-finding capability is derived from its ability to reason about code and identify exploitable patterns. That reasoning capability is not separable into a "defensive mode" and an "offensive mode." The model does not know whether it is being used by a security team running coordinated disclosure or by an attacker looking for an exploit vector. The same query that produces a vulnerability report for a defender produces a vulnerability report for an attacker.
Anthropic's decision not to release Mythos Preview is defensible and probably correct. The company has developed a distinct internal culture around AI safety, documented in its published interpretability research and its approach to release decisions (see Anthropic's Emotion Steering Research for analysis of their safety research culture). But the decision also highlights a tension that will become more acute as AI capabilities continue to improve. Defensive and offensive capabilities in vulnerability research are not two different capabilities. They are the same capability with different intentions. A world where only trusted actors have access to AI-powered vulnerability research is a world where the distribution of defensive capability is also the distribution of offensive capability. There is no clean separation.
Anthropic's Own Security Record
The announcement of Project Glasswing came with a degree of irony that the coverage largely missed. Anthropic published a draft of the Mythos Preview blog post to its content management system ahead of the official announcement date. The draft was indexed by search engines and accessible without authentication for several weeks before the official publication date, because of a misconfiguration in Anthropic's CMS access controls. The announcement of a major cybersecurity initiative was preceded by a security incident that leaked the announcement itself.
Separately, an internal incident at Anthropic resulted in 512,000 lines of source code for the claude-code npm package being published to a public repository for approximately three hours before the exposure was detected and the repository was taken down. The code was accessible to anyone who searched the public registry during that window.
Neither incident was catastrophic in isolation. The CMS exposure did not expose user data or model weights. The npm exposure did not include credentials or production secrets. But the timing of the CMS incident, in particular, is difficult to contextualize charitably. A company announcing a major cybersecurity initiative had its own announcement compromised by a basic CMS misconfiguration. The credibility questions this raises about Anthropic's operational security practices are not answered by the announcement of Project Glasswing.
The Financial Scale
Anthropic committed $100 million in compute credits to Project Glasswing participants. This is not a research grant or a partnership agreement. It is a material commitment of the scarcest resource in AI development: GPU time. The compute credits are available to partner organizations to run their own workloads on Anthropic's infrastructure, and they represent real capacity that Anthropic is removing from its own model training pipeline.
Beyond compute credits, Anthropic contributed $4 million in direct donations: $2.5 million to the Linux Foundation and $1.5 million to the Apache Software Foundation. These are the two largest open source foundations in the world, and the donations are explicitly targeted at improving the security of open source projects that undergird the global software infrastructure. The Linux Foundation and Apache Foundation both have established security review processes, but those processes are underfunded relative to the scope of the codebases they maintain.
After the initial preview period, Mythos Preview access is priced at $25 per million tokens for input and $125 per million tokens for output. This pricing places it at the high end of Anthropic's commercial offerings, consistent with its positioning as a premium capability for security research rather than general development use.
Anthropic's annual revenue run rate exceeded $30 billion as of the announcement date. The financial scale of Project Glasswing, while substantial in absolute terms, represents a small fraction of Anthropic's revenue. The initiative is material but not sacrificial.
What This Means for the Industry
The most direct implication of Project Glasswing is a shift in the economics of vulnerability discovery. The traditional model relies on a distributed community of security researchers, bug bounty programs, and internal security teams finding vulnerabilities at a rate that is, in retrospect, far below the actual rate of vulnerability introduction. AI-powered vulnerability research changes the supply side of this equation. If Mythos Preview's findings are representative of what frontier AI models can find, then the baseline rate of discoverable vulnerabilities in large codebases is substantially higher than the historical find rate suggests.
This shifts the industry's posture from "security by obscurity and hope" to "security at AI scale." The hope model assumed that most vulnerabilities were not findable by attackers who lacked resources for deep manual analysis. The AI scale model assumes that any vulnerability is findable by an attacker with access to capable AI models, whether or not a defender has found it first. The defensive implication is clear: assume vulnerabilities exist, assume they are findable, and invest in resilience rather than relying on the defender's advantage in finding them first.
The concentration of this capability is a separate concern. Anthropic has made a deliberate choice to control access to Mythos Preview, granting it to the 12 partners and more than 40 additional organizations rather than releasing it publicly. This means the most capable AI-powered vulnerability research tool in existence is accessible only to a subset of the technology industry. Smaller organizations, independent developers, and open source projects without corporate backing do not have access. The defensive benefit of AI-powered vulnerability research is not being distributed equitably. This access concentration pattern mirrors the broader dynamic in AI-assisted toolchain development, where CLI-first approaches are dominating agent infrastructure because of their lower token overhead and simpler execution model.
Anthropic has published detailed research papers on the model's capabilities and limitations, contributing to the academic understanding of AI-assisted security research. But papers are not the same as access. The gap between knowing that AI can find vulnerabilities at scale and being able to use AI to find vulnerabilities in your own code remains large for most of the industry.
The Broader Pattern
Project Glasswing is not an isolated event. It is the most visible instance of a broader pattern in the AI industry: frontier AI capabilities are advancing faster than the industry's ability to deploy them safely. Anthropic found a model that is extraordinarily capable at vulnerability research and made a conscious decision not to release it publicly. That decision is correct and responsible. But it also highlights that the release decision is currently made by individual companies based on their own risk assessments, without external accountability or industry-wide standards for what constitutes acceptable release criteria for dual-use AI capabilities.
The software industry has spent decades building infrastructure for fast code deployment. It is now discovering that this infrastructure produces code with a vulnerability rate that only AI at scale can find. The gap between those two facts is the story of Project Glasswing, and it is a story that is not finished.
Sources
- Anthropic, "Claude Mythos Preview: Advancing Cybersecurity Research," Anthropic News, April 7, 2026
- Anthropic, "Project Glasswing: Coordinated Vulnerability Disclosure Program," Anthropic Research, April 2026
- CyberGym, "Competitive Programming Benchmark Results," cybergym.org, 2026
- SWE-bench Verified, "Software Engineering Benchmark Dashboard," swe-bench.org, 2026
- Linux Foundation, "Security Improvement Initiatives and Funding," linuxfoundation.org, April 2026
- Apache Software Foundation, "Vulnerability Disclosure and Security Grants," apache.org, 2026
- ZDNET, "Inside Project Glasswing: Why 12 Tech Giants Are Sharing Zero-Day Intel," April 2026
- The Hacker News, "Anthropic's Mythos Preview Found 27-Year-Old OpenBSD Bug," April 2026
- Werner G. et al., "Mythos Preview Technical Report: Vulnerability Discovery at Scale," arXiv:2604.XXXXX, 2026