Anthropic Accidentally Leaked Their Most Dangerous AI Model Yet, and What It Can Do Is Terrifying
A security researcher found 3,000 internal Anthropic files wide open on the internet, revealing a secret AI model called 'Mythos' that can hack computers better than any human.
Imagine accidentally leaving your diary on a park bench, except your diary contains blueprints for one of the most powerful AI systems ever built. That is basically what just happened to Anthropic, the company behind the Claude AI assistant.
On March 26, a security researcher discovered that a misconfigured data store on Anthropic's servers had left nearly 3,000 internal files exposed to anyone who knew where to look. Among them was a detailed draft blog post describing a new model called Claude Mythos, internally codenamed 'Capybara.'
Here is what makes this story wild: Mythos is described as being dramatically better than anything Anthropic has ever built. It is not just smarter at answering questions or writing code. According to the leaked documents, it is 'currently far ahead of any other AI model in cyber capabilities,' meaning it can find and exploit security vulnerabilities in ways that 'far outpace the efforts of defenders.'
In plain English? This AI is better at hacking than the people trying to stop hackers.
Anthropic did not deny any of it. A spokesperson confirmed they are 'developing a general purpose model with meaningful advances in reasoning, coding, and cybersecurity' and said they are 'being deliberate about how we release it.'
The company plans to roll out access in phases, starting with cybersecurity partners so defenders can prepare before the model goes wide. Think of it like giving the police a head start before releasing a new type of lockpick to the public.
As reported by RenovateQR.