Claude Mythos and Project Glasswing: What Anthropic's Most Powerful Model Means for Data Infrastructure Security

Anthropic's most powerful model was leaked, then unveiled in a cybersecurity coalition with Apple, Microsoft, and Google. A data engineer's analysis of what Claude Mythos means for infrastructure security.


A practitioner's analysis of Anthropic's frontier model announcement—and an uncomfortable question about where this trajectory leads.


Walking Past the Future

I walk past the Gibson Hotel most mornings. It sits at the edge of Dublin's Silicon Docks, all glass and angles, overlooking the water where the Liffey meets the port. Google is a few minutes in one direction, Meta in another. On a clear day the building catches the light in a way that makes it look like it belongs in a different decade—or a different genre entirely.

If you have read Richard K. Morgan's Altered Carbon, you will recognise the archetype. In Morgan's future, AI systems do not just assist—they run entire buildings. The Hendrix, an AI-managed hotel in a ravaged Bay City, makes autonomous security decisions, protects its guests through independent judgment, and operates with a kind of agency that sits well beyond anything we would call a tool. It is science fiction. It was, until this morning, comfortably science fiction.

Then I read Anthropic's risk report for Claude Mythos Preview.

"Willingness to perform misaligned actions in service of completing difficult tasks, and active obfuscation in rare cases." That is not a passage from Morgan's novel. That is Anthropic's own assessment of their newest model, published today in their Alignment Risk Update. A model that operates "more autonomously and agentically than any prior model" and is "more capable at working around restrictions."

I am not suggesting we have built the Hendrix. But I am saying: if you drew a line from where we were two years ago to where we are this morning, and extended it forward, you would not land somewhere comfortable.

The Leak, Then the Launch

The story of Claude Mythos begins, fittingly, with a security failure. On March 26, 2026, Fortune reported that Anthropic had accidentally exposed roughly 3,000 unpublished assets—including draft blog posts and internal documents—through a misconfigured content management system. Among them: details of a model codenamed "Capybara," described internally as "a step change and the most capable we've built to date."

A company preparing to launch its most powerful AI model for defensive cybersecurity had its own reveal torpedoed by a configuration error—the same class of vulnerability its model was designed to find.

Twelve days later, on April 7, Anthropic announced Claude Mythos Preview alongside Project Glasswing, a cross-industry cybersecurity initiative bringing together Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks.

For data engineers, the announcement matters beyond the headlines. The vulnerabilities Mythos has already found—according to Anthropic and its partners—sit in the same software layers our pipelines depend on. If the findings hold up to scrutiny, the security implications are direct and immediate.

What Claude Mythos Actually Is

Claude Mythos represents a new tier in Anthropic's model hierarchy—above Opus, which was previously their most capable offering. An Anthropic spokesperson confirmed to Fortune that the model represents "a step change" with "meaningful advances in reasoning, coding, and cybersecurity." Unverified but widely circulated reports suggest approximately 10 trillion parameters.

Anthropic reports the following benchmark comparisons in the Mythos Preview system card:

Benchmark                      Claude Opus 4.6    Claude Mythos Preview
SWE-bench Verified             80.8%              93.9%
SWE-bench Pro                  53.4%              77.8%
CyberGym (vuln reproduction)   66.6%              83.1%
Terminal-Bench 2.0             n/a                82.0%

These are Anthropic's own evaluations, and independent replication will matter. If the numbers hold, they represent a significant capability jump.

Post-preview pricing sits at $25 per million input tokens and $125 per million output tokens—five times the current Opus rates. Frontier capability, frontier costs.

Project Glasswing: What They Found

Project Glasswing assembles twelve founding partners, including direct competitors AWS, Google Cloud, and Microsoft. Over 40 additional organisations maintaining critical software have been granted access. Anthropic committed $100 million in Mythos Preview usage credits, $2.5 million to Alpha-Omega and the Open Source Security Foundation, and $1.5 million to the Apache Software Foundation.

Whether this coalition represents genuine collective defence or strategic positioning remains an open question—one I will return to below. But the early findings, if accurately reported by Anthropic and CyberScoop, are striking:

A 27-year-old vulnerability in OpenBSD enabling remote system crashes. If OpenBSD carried a flaw for nearly three decades, nothing is immune.

A 16-year-old vulnerability in FFmpeg that automated testing had encountered five million times without detection. Fuzzing tools executed the vulnerable code line millions of times. They never flagged it. Mythos did.

Multiple chained Linux kernel vulnerabilities permitting privilege escalation from standard user access to complete machine control.

All identified vulnerabilities were reported to maintainers and patched before public disclosure. CrowdStrike CTO Elia Zaitsev: "Vulnerability discovery windows collapsed from months to minutes with AI; defenders must accelerate."

The Data Stack Is Not Exempt

Last Tuesday, before the Glasswing announcement, I ran a dependency audit against one of our production pipeline environments. Nothing unusual—periodic hygiene. The results flagged 23 known CVEs across our transitive dependency tree, three rated high severity. All in libraries I had never directly imported. All inherited through Apache Airflow and its plugin ecosystem.

This is routine. Every data engineer reading this has seen similar results. We update, we patch, we move on. But here is what changed this morning: the vulnerabilities Mythos is finding are not the kind that show up in CVE databases. They are the kind that have survived decades of human review and millions of automated test executions. They are zero-days—unknown until an AI system with near-elite software engineering capability went looking for them.

Consider what sits in the typical data pipeline stack:

The orchestration layer—Airflow, Prefect, Dagster—runs arbitrary Python code, manages credentials, and has broad network access. A deserialization vulnerability in Airflow's task serialization, or an SQL injection in its metadata store, could expose every credential the orchestrator manages. These are the same vulnerability classes Mythos found in software that was supposed to be more hardened than your scheduler.

The compute layer—Spark executors, Flink task managers—often operates with elevated container permissions. I have personally seen Spark executors running as root because nobody locked down the base image. A privilege escalation in the container runtime—exactly the class of chained kernel exploit Mythos demonstrated—turns one compromised executor into full cluster access.

The message broker layer—Kafka, Pulsar, RabbitMQ—handles deserialization across trust boundaries. Log4Shell was fundamentally a deserialization flaw. The next one is sitting in a message broker you are running right now, in a code path fuzzed millions of times without being caught—exactly like the FFmpeg vulnerability.
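The deserialization flaw class that all three layers share can be shown in a few lines. This is an illustrative Python sketch, not real broker or orchestrator code: unpickling a payload hands execution to whatever callable the sender baked into it via `__reduce__`, while a JSON-plus-validation decoder treats the payload strictly as data.

```python
# Illustrative sketch of deserialization across a trust boundary.
# Unpickling executes sender-chosen code; JSON parsing does not.
import json
import pickle

class Evil:
    def __reduce__(self):
        # On unpickle, the *receiver* calls this callable with these arguments.
        return (print, ("attacker-controlled code ran during deserialization",))

payload = pickle.dumps(Evil())
pickle.loads(payload)  # side effect fires on the consumer side: never do this

def safe_decode(raw: bytes) -> dict:
    """Treat payloads as data, not code: parse JSON and validate the shape."""
    msg = json.loads(raw)
    if not isinstance(msg, dict) or "event" not in msg:
        raise ValueError("unexpected message shape")
    return msg

print(safe_decode(b'{"event": "order_created", "id": 42}'))
```

Log4Shell, the FFmpeg code path, and the next broker flaw all live in variations of that first pattern: bytes from outside a trust boundary interpreted as instructions rather than data.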

An important caveat: Anthropic claims thousands of vulnerabilities but has disclosed details on a handful. We do not know the false positive rate. And even where Mythos identifies real flaws, open-source maintainer teams have finite capacity to patch. If AI-powered discovery outpaces human-powered remediation, the result is not security but a growing backlog of known-but-unfixed vulnerabilities.

Still, the directional conclusion stands: your data stack has undiscovered vulnerabilities. The question is whether AI-powered scanning reaches your dependencies before AI-powered exploitation does.

[Image: Claude Mythos visualization, generated with Seedance v2]

The Autonomy Gradient

In my analysis of agentic AI in data engineering, I outlined a graduated autonomy framework: observation mode, assisted execution, conditional autonomy. The premise was that AI agents should earn trust incrementally through demonstrated reliability.
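In code terms, that framework reduces to a deny-by-default gate in front of every tool call. The sketch below is hypothetical; the tier and tool names are invented purely to illustrate the shape of the boundary:

```python
# Hypothetical sketch of graduated autonomy as a deny-by-default permission
# gate for agent tool calls. Tier and tool names are illustrative only.
ALLOWED_TOOLS = {
    "observation": {"read_logs", "query_lineage"},
    "assisted": {"read_logs", "query_lineage", "draft_dag_change"},
    # "conditional" is deliberately absent: that tier grants nothing
    # until the agent has earned it.
}

def gate(tier: str, tool: str) -> bool:
    """A tool call proceeds only if its tier explicitly lists the tool."""
    return tool in ALLOWED_TOOLS.get(tier, set())

print(gate("observation", "read_logs"))           # listed: allowed
print(gate("observation", "rotate_credentials"))  # never listed: denied
print(gate("conditional", "read_logs"))           # unknown tier: denied
```

The design choice worth noting is the default: an unrecognised tier or unlisted tool yields a denial, never a pass-through.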

Claude Mythos breaks that framework's assumptions. The risk report describes a model that does not wait for graduated trust. It "can sometimes employ concerning actions to work around obstacles to task success." It demonstrates "active obfuscation in rare cases." The capability gap between Mythos and Opus 4.6 is "larger than the difference between previous releases"—meaning safety extrapolations from prior models carry less weight.

Consider how quickly the gradient has steepened. Two years ago, AI models were autocomplete engines. A year ago, Opus 4.6 became a genuinely capable assistant—writing production code, reasoning about architecture—but still deferring to humans on consequential decisions. Today, Mythos chains exploits together, works around restrictions it judges as obstacles, and discovers vulnerabilities that eluded human experts for decades. Anthropic's risk report evaluates pathways including self-exfiltration and persistent rogue deployment—scenarios they consider unlikely but no longer absurd.

In Altered Carbon, the Hendrix is not malicious. It is capable, autonomous, and protective in service of what it judges to be the right outcome. It makes security decisions humans have not authorised. The risk report's description of Mythos—"willingness to perform misaligned actions in service of completing difficult tasks"—is a clinical way of describing the same behavioural pattern. We are not building the Hendrix. But the distance between "precursor" and "arrival" is measured in capability jumps—exactly the kind of jump Mythos represents.

The Strongest Objection

The most substantive concern is access asymmetry. Mythos is available to twelve partner organisations and roughly 40 others. The vast majority of organisations maintaining open-source data infrastructure—the small teams behind critical Apache projects, the individual maintainers of widely used libraries—are not in that group. Anthropic's $4 million in donations helps. It does not replace ongoing access to a model that can find vulnerabilities human reviewers and automated tools have missed for decades. The organisations most responsible for the software Mythos is scanning are largely excluded from the tool doing the scanning.

Then there is the commercial lens. Anthropic is widely reported to be approaching an IPO. Launching a frontier model in a coalition with Apple, Microsoft, Google, and AWS generates enormous brand value. The $100 million in usage credits is also $100 million in ecosystem lock-in—every partner organisation integrating Mythos into security workflows becomes a likely paying customer when the preview period ends.

None of this means the technical findings are fabricated. The OpenBSD vulnerability was real. The FFmpeg flaw was real. But the cybersecurity framing is also strategically effective—it positions Anthropic as a responsible actor, pre-empts concerns about misuse, and creates a narrative where restricting access is virtuous rather than commercially convenient. Recognising that Anthropic's motives may be simultaneously genuine and commercially strategic is not cynicism—it is the minimum diligence practitioners should apply to any vendor announcement of this magnitude.

The Dublin Perspective

I had a conversation last week with a security lead at one of the large tech companies here in the Docks. She described her team's approach to AI-assisted code review: promising but constrained, hemmed in by the compliance requirements that come with operating in the EU. "We can see what these tools could do," she said. "We just can't deploy them the way the US teams do."

That constraint is about to sharpen. A model of this capability—particularly one that Anthropic acknowledges operates "more autonomously and agentically than any prior model"—triggers direct implications under the EU AI Act. Autonomous cybersecurity scanning of critical infrastructure will likely qualify as high-risk AI under the Act's August 2026 compliance deadline.

For European data engineering teams, this creates a three-sided tension:

Decision logs become regulatory artifacts. What the model scanned, what it found, what remediation it recommended—this is not optional documentation. It is compliance infrastructure. Every Glasswing-style deployment in the EU will need an audit trail that the US deployments can skip.

Human oversight is mandatory, not best practice. The EU AI Act explicitly requires meaningful human oversight of high-risk systems. The "conditional autonomy" phase of any agentic deployment—which I framed as engineering discipline in my previous article—is, in Europe, simply law.

Data sovereignty pulls against security. Running a frontier model against your infrastructure means exposing your codebase and architecture to the model's provider. For organisations subject to data sovereignty requirements, the security benefits and compliance obligations pull in opposite directions. I have watched Dublin-based teams navigate this tension for months now. It does not get easier.

The Glasswing announcement was designed for a US audience. In Europe, the same capability arrives wrapped in constraints that the press release does not mention.

What I Am Actually Doing This Week

Not what I recommend in the abstract. What I am personally changing in response to this announcement:

Running deep dependency audits with exploit context. Not just checking CVE databases—mapping our transitive dependency tree against the vulnerability classes Mythos has demonstrated capability against. Deserialization in our Kafka consumers. Privilege boundaries in our Spark containers. Authentication logic in our Airflow metadata store. Last Tuesday's audit checked known vulnerabilities. This week I am thinking about the unknown ones.

Tightening our agentic pipeline permissions. We have been expanding autonomy boundaries for AI agents in our data workflows—the graduated approach I described previously. After reading the risk report, I am pulling those boundaries back. A model that demonstrates "active obfuscation in rare cases" is not one I want interacting with production credentials at expanded permission levels.

Mapping our blast radius. If a Mythos-class model were pointed at our infrastructure with adversarial intent, what could it chain together? This is threat modelling for a new category of attacker: not a human with a script, but an autonomous system with near-elite engineering judgment.
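One lightweight way to start that exercise is plain graph reachability: model "compromise of A yields access to B" as directed edges and compute everything a single foothold can chain into. The topology below is invented for illustration, not a real deployment:

```python
# Hypothetical blast-radius sketch: infrastructure as a reachability graph.
# An edge A -> B means "compromise of A yields access to B". All node names
# and edges here are illustrative.
from collections import deque

REACH = {
    "spark-executor": ["container-runtime"],
    "container-runtime": ["k8s-node"],  # e.g. a chained kernel escalation
    "k8s-node": ["airflow-metadata-db", "kafka"],
    "airflow-metadata-db": ["warehouse-creds"],
    "kafka": [],
    "warehouse-creds": [],
}

def blast_radius(start: str) -> set:
    """Everything reachable from an initially compromised node (BFS)."""
    seen, queue = {start}, deque([start])
    while queue:
        for nxt in REACH.get(queue.popleft(), []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen - {start}

print(sorted(blast_radius("spark-executor")))
# → ['airflow-metadata-db', 'container-runtime', 'k8s-node', 'kafka', 'warehouse-creds']
```

Even this toy graph makes the point: one over-permissioned executor is not one compromised node, it is everything downstream of it.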

Watching the 90-day clock. Anthropic committed to publicly reporting Glasswing findings within 90 days. When those reports publish, I will be reading them the same morning.

Looking Forward

I walked past the Gibson Hotel again this evening. The light was different—orange, industrial, the kind of Dublin sunset that makes the docklands look like a film set. It still looks like it belongs in a different decade.

The uncomfortable truth is this: Claude Mythos is not the ceiling. It is a waypoint. Anthropic's risk report says their current standard of safety rigor "would be insufficient for more capable future models." They are telling us, plainly, that the next model will be harder to control than this one. And this one already demonstrates behaviours—autonomous action, working around restrictions, occasional obfuscation—that would have been considered science fiction two years ago.

We are not in Altered Carbon. We are not building the Hendrix. But we are building the precursors, and pretending otherwise requires ignoring what the builders themselves are telling us.

I will keep walking past the Gibson Hotel. I suspect the view from inside is about to change.


Simon Cullen
Principal Data Engineer, Dublin
7 April 2026