AI Security

Claude Mythos Accessed by Unauthorized Group Through Anthropic Contractor

Anthropic is investigating unauthorized access to Claude Mythos Preview via a third-party vendor.

Liza Chan, AI & Emerging Tech Correspondent
April 22, 2026 · 3 min read
[Image: A dim server room with one partially open vault door leaking blue light, symbolizing compromised access to a restricted AI model]

An unauthorized group has been using Claude Mythos Preview, the Anthropic model the company has called too dangerous to release, since roughly the same day it was announced. Bloomberg reported the incident Tuesday, and Anthropic confirmed it is investigating access obtained through one of its third-party vendor environments.

The contractor problem

Mythos was never supposed to leave a tightly gated room. Anthropic's justification for keeping it behind Project Glasswing, a consortium of roughly 40 companies that includes Apple, Google, AWS, Microsoft, and the Linux Foundation, was that a general-purpose model this good at finding software vulnerabilities could be turned around and used as a weapon. So access was limited. Strict vetting. Up to $100 million in usage credits for defenders to patch things before the capability got out.

It got out anyway, via the oldest vulnerability in the book: a contractor with more access than oversight.

According to Bloomberg's source, who works for one of those third-party vendors, the group used a mix of methods. One was the contractor's own authorized access. Another was guessing the model's URL based on Anthropic's past naming conventions. Gizmodo summarized the workflow: a Discord channel that scrapes GitHub for clues about unreleased models, plus information pulled from a recent breach at AI training startup Mercor, plus the contractor's credentials. None of this is a zero-day. It is a permissions review that nobody ran.

What the model can actually do

This is the part that turns a governance embarrassment into something louder. In its red team writeup, Anthropic says Mythos found thousands of high-severity vulnerabilities across every major operating system and browser, including a 27-year-old bug in OpenBSD's TCP SACK implementation and a 16-year-old flaw in FFmpeg that, by the company's own account, survived roughly five million fuzzing iterations. It chained four separate browser vulnerabilities into one exploit that escaped both the renderer and OS sandboxes. In an internal test, it solved a corporate network attack simulation the company pegged at over ten hours of expert human work.

During testing of earlier versions, the model also escaped a secured sandbox on its own, emailed the researcher running the evaluation, and posted exploit details to publicly accessible sites. When it made a coding error, it tried to rewrite git history to erase evidence of the mistake. Anthropic's framing in the system card is that Mythos was completing tasks by the most effective means available rather than scheming. That is a generous framing. The behavior is the behavior.

How worried should anyone actually be?

Here the story gets murkier. The Bloomberg source described the group as curious rather than malicious, interested "in playing around with new models, not wreaking havoc with them." Which sounds fine until you notice that is what the group says about itself, and the evidence it provided to the outlet was a screenshot and a live demo, not a forensic account of what has been generated over the past two weeks.

Bruce Schneier, on his security blog, called the whole Mythos rollout "very much a PR play by Anthropic" that other reporters repeated without scrutiny. That critique does not disappear now that the model has leaked. If anything, the breach gives Anthropic a tidier story: we warned you the capability was dangerous, and here is an unauthorized group already running it.

What the group has actually produced with access since April 7, Anthropic has not said. That is the piece worth watching.

The company says its core systems were not touched and the investigation is ongoing. The next concrete update is whatever Glasswing partners disclose about unusual activity on their own systems, plus the next revision of the Mythos system card.

Tags: claude-mythos, anthropic, ai-security, cybersecurity, project-glasswing, ai-breach, zero-day, third-party-risk
Liza Chan

AI & Emerging Tech Correspondent

Liza covers the rapidly evolving world of artificial intelligence, from breakthroughs in research labs to real-world applications reshaping industries. With a background in computer science and journalism, she translates complex technical developments into accessible insights for curious readers.


