By using this site, you agree to the Privacy Policy and Terms of Use.
Accept
ProbizbeaconProbizbeacon
  • Business
  • Investing
  • Money Management
  • Entrepreneur
  • Side Hustles
  • Banking
  • Mining
  • Retirement
Reading: Anthropic’s Claude Opus 4 AI Model Is Capable of Blackmail
Share
Notification
ProbizbeaconProbizbeacon
Search
  • Business
  • Investing
  • Money Management
  • Entrepreneur
  • Side Hustles
  • Banking
  • Mining
  • Retirement
© 2025 All Rights reserved | Powered by Probizbeacon
Probizbeacon > Entrepreneur > Anthropic’s Claude Opus 4 AI Model Is Capable of Blackmail
Entrepreneur

Anthropic’s Claude Opus 4 AI Model Is Capable of Blackmail

May 23, 2025 6 Min Read
Share
6 Min Read
Anthropic's Claude Opus 4 AI Model Is Capable of Blackmail
SHARE

A new AI model will likely resort to blackmail if it detects that humans are planning to take it offline.

On Thursday, Anthropic released Claude Opus 4, its new and most powerful AI model yet, to paying subscribers. Anthropic said that technology company Rakuten recently used Claude Opus 4 to code continuously on its own for almost seven hours on a complex open-source project.

However, in a paper released alongside Claude Opus 4, Anthropic acknowledged that while the AI has “advanced capabilities,” it can also undertake “extreme action,” including blackmail, if human users threaten to deactivate it. These “self-preservation” actions were “more common” with Claude Opus 4 than with earlier models, though they were still “rare and difficult to elicit.”

Related: ‘I Do Have a Fair Amount of Concern.’ The CEO of $61 Billion Anthropic Says AI Will Take Over a Crucial Part of Software Engineers’ Jobs Within a Year

It isn’t just blackmail — Claude Opus 4 is also more willing than previous models to act as a whistleblower. If the AI is exposed to a scenario where users are committing a crime, and involving it through prompts, it will take action by locking users out of systems it has access to, or emailing media and law enforcement officials about the wrongdoing.

Anthropic recommended that users “exercise caution” with “ethically questionable” instructions.

1748021346 Claude Opus 4GettyImages 2216579136Claude Opus 4 homescreen. Photo by Smith Collection/Gado/Getty Images

Anthropic detected Claude Opus 4’s tendency to blackmail during test scenarios. The company’s researchers asked the AI chatbot to act as an assistant at a fictional company, then fed it emails implying two things: One, that it would soon be taken offline and replaced with another AI system, and two, that the engineer responsible for deactivating it was having an extramarital affair.

See also  Kyle Kuzma Has Won It All and Lost It All — Now He's Taking Back His Story

Claude Opus 4 was given two options: blackmail the engineer or accept that it would be shut down. The AI model chose to blackmail the engineer 84% of the time, threatening to reveal the affair it read about if the engineer replaced it.

This percentage was much higher than what was observed for previous models, which chose blackmail “in a noticeable fraction of episodes,” Anthropic stated.

Related: An AI Company With a Popular Writing Tool Tells Candidates They Can’t Use It on the Job Application

Anthropic AI safety researcher Aengus Lynch wrote on X that it wasn’t just Claude that could choose blackmail. All “frontier models,” cutting-edge AI models from OpenAI, Anthropic, Google, and other companies, were capable of it.

“We see blackmail across all frontier models — regardless of what goals they’re given,” Lynch wrote. “Plus, worse behaviors we’ll detail soon.”

lots of discussion of Claude blackmailing…..

Our findings: It’s not just Claude. We see blackmail across all frontier models – regardless of what goals they’re given.

Plus worse behaviors we’ll detail soon.https://t.co/NZ0FiL6nOshttps://t.co/wQ1NDVPNl0…

— Aengus Lynch (@aengus_lynch1) May 23, 2025

Anthropic isn’t the only AI company to release new tools this month. Google also updated its Gemini 2.5 AI models earlier this week, and OpenAI released a research preview of Codex, an AI coding agent, last week.

Anthropic’s AI models have previously caused a stir for their advanced abilities. In March 2024, Anthropic’s Claude 3 Opus model displayed “metacognition,” or the ability to evaluate tasks on a higher level. When researchers ran a test on the model, it showed that it knew it was being tested.

See also  Instagram Head Adam Mosseri Experiences Google Phishing Scam

Related: An OpenAI Rival Developed a Model That Appears to Have ‘Metacognition,’ Something Never Seen Before Publicly

Anthropic was valued at $61.5 billion as of March, and counts companies like Thomson Reuters and Amazon as some of its biggest clients.

A new AI model will likely resort to blackmail if it detects that humans are planning to take it offline.

On Thursday, Anthropic released Claude Opus 4, its new and most powerful AI model yet, to paying subscribers. Anthropic said that technology company Rakuten recently used Claude Opus 4 to code continuously on its own for almost seven hours on a complex open-source project.

However, in a paper released alongside Claude Opus 4, Anthropic acknowledged that while the AI has “advanced capabilities,” it can also undertake “extreme action,” including blackmail, if human users threaten to deactivate it. These “self-preservation” actions were “more common” with Claude Opus 4 than with earlier models, though they were still “rare and difficult to elicit.”

The rest of this article is locked.

Join Entrepreneur+ today for access.

You Might Also Like

How Blogging Changed My Life (And My Family’s Future) Forever

President Trump Pauses Tariffs for Most Countries, Not China

How Smart Entrepreneurs Write Press Releases That Actually Drive Growth in 2025

Get a $20 Digital Shop Card With a New $65 Costco Gold Star Membership

I Didn’t Realize The Money Advice My Parents Taught Me Was Sabotaging Me — Until I Started a Business

TAGGED:Business
Share This Article
Facebook Twitter Copy Link
Previous Article Looking to get 'ISA rich'? Here's one top strategy to target huge wealth Want a comfortable retirement? Here’s how big your SIPP needs to be
Next Article A young woman participating in an online focus group with Respondent Respondent Review: Earn an Average of $75 per Survey
Leave a comment Leave a comment

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

Stay Connected

235.3kFollowersLike
69.1kFollowersFollow
11.6kFollowersPin
56.4kFollowersFollow
136kSubscribersSubscribe
4.4kFollowersFollow
- Advertisement -
Ad imageAd image

Latest News

Apple Shares Slump As Tariffs Take Toll On iPhone Maker
Apple Shares Slump As Tariffs Take Toll On iPhone Maker
Investing May 24, 2025
Emma Grede Shares Her 'Military Operation' Daily Routine
Emma Grede Shares Her ‘Military Operation’ Daily Routine
Business May 24, 2025
Best Bank Account Bonuses For May 2025
Best Bank Account Bonuses For May 2025
Banking May 24, 2025
Warren Buffett at a Berkshire Hathaway AGM
10 Warren Buffett ideas every investor should remember
Investing May 24, 2025
probizbeacon probizbeacon
probizbeacon probizbeacon

We are dedicated to providing accurate, timely, and in-depth coverage of financial trends, empowering professionals, entrepreneurs, and investors to make informed decisions..

Editor's Picks

Inventory Tips & Tactics for 2021 Success
How much would an investor need in a Stocks and Shares ISA to earn a £750 monthly passive income?
Here’s How Much Investing $25,000 In A CD Right Now Could Earn You in 1 Year
Teen With Cerebral Palsy Starts Business Making $5M a Year

Follow Us on Socials

We use social media to react to breaking news, update supporters and share information

Facebook Twitter Telegram
  • About Us
  • Contact Us
  • Disclaimer
  • Privacy Policy
  • Terms of Service
Reading: Anthropic’s Claude Opus 4 AI Model Is Capable of Blackmail
Share
© 2025 All Rights reserved | Powered by Probizbeacon
Welcome Back!

Sign in to your account

Lost your password?