Anthropic blocks hackers misusing Claude AI for phishing and malware

Anthropic said Wednesday it had detected and blocked hackers trying to misuse its Claude AI to generate phishing emails, write malicious code, and bypass safety measures.

The company’s report highlights rising concerns over AI being used in cybercrime, fueling calls for stronger safeguards from tech firms and regulators as the technology becomes more widespread.

Anthropic said one attacker targeted at least 17 organizations, including healthcare providers, emergency services, and government and religious institutions.

“Rather than encrypt the stolen information with traditional ransomware, the actor threatened to expose the data publicly in order to attempt to extort victims into paying ransoms that sometimes exceeded $500,000,” the company said.

Anthropic noted that criminals with few technical skills are using AI to conduct complex operations, and that it expects “attacks like this to become more common as AI-assisted coding reduces the technical expertise required for cybercrime.”

Anthropic said it quickly banned the accounts involved after discovering the misuse, developed a custom AI classifier and new detection methods to spot similar activity in the future, and shared technical details of the attack with authorities to help prevent further abuse.

The company, backed by Amazon.com (NASDAQ:AMZN) and Alphabet (NASDAQ:GOOGL), did not publish technical indicators such as IP addresses or prompts.

Anthropic further said that it discovered that North Korean operatives had been using Claude to fraudulently obtain and keep remote jobs at U.S. Fortune 500 tech companies.

“They used the AI to create fake identities with credible professional histories, complete coding and technical assessments during hiring, and perform real technical work after being employed,” the report noted.
