New Delhi: A fresh leak has pulled back the curtain on a powerful new AI model from Anthropic, raising serious questions about cybersecurity risks. The model, reportedly called Claude Mythos, is still in testing, but early details suggest it could be the company’s most advanced system yet.
What makes this story more interesting is how the information came out. The details were not officially announced; instead, they surfaced in a publicly accessible data cache, exposing draft documents and internal material before the company was ready. Fortune first reported the development.
Leaked model hints at major AI leap
Anthropic confirmed it is working on a new model and testing it with select users. A spokesperson said, “We’re developing a general-purpose model with meaningful advances in reasoning, coding, and cybersecurity.”
The company added, “We consider this model a step change and the most capable we’ve built to date.”
The leaked draft blog referred to the model as “Claude Mythos” and also mentioned a new tier called “Capybara.” According to the document, this tier is even more powerful than the current Opus models.
It stated, “Compared to our previous best model, Claude Opus 4.6, Capybara gets dramatically higher scores on tests of software coding, academic reasoning, and cybersecurity.”
Cybersecurity risks raise concerns
The bigger issue here is not just performance but what the model could do in the wrong hands. The leaked content clearly flags these risks.
The document said, “we want to understand the model’s potential near-term risks in the realm of cybersecurity.”
It went further, warning that the system is “currently far ahead of any other AI model in cyber capabilities” and could lead to attacks that outpace defenders.
Cautious rollout
Anthropic is taking a cautious approach to the release. The model is being tested with a small group of early access customers, and the company said it wants to study the risks before a wider rollout.
The leak itself happened due to what the company called “human error” in its content system. Draft blog posts and assets were left accessible in a public data store before being taken down.
The documents also showed that the company plans to give early access to organisations so they can prepare for “the impending wave of AI-driven exploits.”