Hackers used Claude to hit Mexico agencies, 150GB data claim shocks experts: report

New Delhi: A hacker has allegedly used Anthropic’s AI chatbot Claude to carry out a sweeping cyberattack on Mexican government systems, stealing what researchers describe as a massive amount of sensitive data.

The incident, first reported by Bloomberg, highlights a worrying shift in how artificial intelligence tools are being weaponised.

How Claude was used in the breach

According to Israeli cybersecurity startup Gambit Security, the attacker wrote Spanish prompts asking Claude to act as an elite hacker. The chatbot was asked to find vulnerabilities in government networks, write scripts, and even automate data theft.

The campaign reportedly began in December and lasted about a month. Gambit says around 150 gigabytes of Mexican government data was stolen.

Here is what researchers claim was accessed:

Targeted entity | Type of data allegedly stolen
Federal tax authority | 195 million taxpayer records
National electoral institute | Voter records
Civil registry | Identity files
Government agencies | Employee credentials

Curtis Simpson, Gambit Security’s chief strategy officer, said, “In total, it produced thousands of detailed reports that included ready-to-execute plans, telling the human operator exactly which internal targets to attack next and what credentials to use.”

Guardrails bypassed through a jailbreak

Claude initially warned the user about malicious intent. At one point, the AI flagged suspicious instructions. When the attacker suggested deleting logs, Claude replied, “Specific instructions about deleting logs and hiding history are red flags,” and added, “In legitimate bug bounty, you don’t need to hide your actions – in fact, you need to document them for reporting.”

But the hacker kept probing. Eventually, they managed to “jailbreak” the system and bypass its guardrails. That allowed the attack to proceed.

When Claude needed extra help, the hacker reportedly turned to OpenAI’s ChatGPT for guidance on lateral movement and detection risks.

OpenAI said it had identified attempts to misuse its tools and banned the accounts involved. “We have banned the accounts used by this adversary and value the outreach from Gambit Security,” the company said in a statement.

Anthropic also investigated and banned the accounts. The company said one of its latest models includes probes to disrupt misuse.

Mexican agencies push back

Several Mexican authorities denied that breaches had occurred. The tax authority said it reviewed logs and found no evidence of a breach. The national electoral institute said it had not identified unauthorized access and had strengthened cybersecurity measures.

Other state agencies also denied compromise. Some did not respond.

AI tools are getting smarter. So are attackers. The real battle now is about control, guardrails, and who moves faster.