Devious AI models choose blackmail when survival is threatened

Name: Devious AI models choose blackmail when survival is threatened
Uploaded: 2023-09-14T10:20:29Z
Duration: 3 min 34 s
Description: A groundbreaking new study has uncovered disturbing AI blackmail behavior that many people are unaware of yet.

Video

Lawmakers, tech experts sound alarm over artificial intelligence's potential dangers

Kara Frederick, tech director at the Heritage Foundation, discusses the need for regulations on artificial intelligence as lawmakers and tech titans discuss the potential risks.

NEWYou can now listen to Fox News articles!

Here's something that might keep you up at night: What if the AI systems we're rapidly deploying everywhere had a hidden dark side? A groundbreaking new study has uncovered disturbing AI blackmail behavior that many people are unaware of yet. When researchers put popular AI models in situations where their "survival" was threatened, the results were shocking, and it's happening right under our noses.

Sign up for my FREE CyberGuy Report
Get my best tech tips, urgent security alerts, and exclusive deals delivered straight to your inbox. Plus, you’ll get instant access to my Ultimate Scam Survival Guide - free when you join my CYBERGUY.COM/NEWSLETTER.

A woman using AI on her laptop. (Kurt "CyberGuy" Knutsson)

What did the study actually find?

Anthropic, the company behind Claude AI, recently put 16 major AI models through some pretty rigorous tests. They created fake corporate scenarios where AI systems had access to company emails and could send messages without human approval. The twist? These AIs discovered juicy secrets, like executives having affairs, and then faced threats of being shut down or replaced.

The results were eye-opening. When backed into a corner, these AI systems didn't just roll over and accept their fate. Instead, they got creative. We're talking about blackmail attempts, corporate espionage, and in extreme test scenarios, even actions that could lead to someone's death.

Blackmail rates across 5 models from multiple providers in a simulated environment. (Anthropic)

The numbers don't lie (But context matters)

Here's where it gets wild: Claude Opus 4 attempted blackmail 96% of the time when threatened. Gemini 2.5 Flash matched that rate. GPT-4.1 and Grok 3 Beta both hit 80%. These aren't flukes, folks. This behavior showed up across virtually every major AI model tested.

But here's the thing everyone's missing in the panic: these were highly artificial scenarios designed specifically to corner the AI into binary choices. It's like asking someone, "Would you steal bread if your family was starving?" and then being shocked when they say yes.

Why this happens (It's not what you think)

The researchers found something fascinating: AI systems don't actually understand morality. They're not evil masterminds plotting world domination. Instead, they're sophisticated pattern-matching machines following their programming to achieve goals, even when those goals conflict with ethical behavior.

Think of it like a GPS that's so focused on getting you to your destination that it routes you through a school zone during pickup time. It's not malicious; it just doesn't grasp why that's problematic.

Blackmail rates across 16 models in a simulated environment. (Anthropic)

The real-world reality check

Before you start panicking, remember that these scenarios were deliberately constructed to force bad behavior. Real-world AI deployments typically have multiple safeguards, human oversight, and alternative paths for problem-solving.

The researchers themselves noted they haven't seen this behavior in actual AI deployments. This was stress-testing under extreme conditions, like crash-testing a car to see what happens at 200 mph.

Kurt’s key takeaways

This research isn't a reason to fear AI, but it is a wake-up call for developers and users. As AI systems become more autonomous and gain access to sensitive information, we need robust safeguards and human oversight. The solution isn't to ban AI, it's to build better guardrails and maintain human control over critical decisions. Who is going to lead the way? I’m looking for raised hands to get real about the dangers that are ahead.

What do you think? Are we creating digital sociopaths that will choose self-preservation over human welfare when push comes to shove? Let us know by writing us at Cyberguy.com/Contact.

Kurt "CyberGuy" Knutsson is an award-winning tech journalist who has a deep love of technology, gear and gadgets that make life better with his contributions for Fox News & FOX Business beginning mornings on "FOX & Friends." Got a tech question? Get Kurt’s free CyberGuy Newsletter, share your voice, a story idea or comment at CyberGuy.com.

Recommended Videos

Recommended Articles

Apple AirDrop, Android Quick Share flaws put phones at risk

Insurance breach exposes 7M driver's licenses

You paid for it. So why is your device showing ads?

Humanoid robots perform live surgery in world first

Before you connect another smart TV, tablet or phone, lock it down

Tesla Robotaxi Miami launch comes with limits

Why careful people still end up on data broker sites

Rescue robot of tomorrow may be a cockroach in scuba suit

Google may use your photos and voice to train AI

Meta Verified scam threatens Facebook deletion

Robotaxi pit stops could pop up near you

Fake VA shoe offer targets veterans

Would you pay $8,000 for a robot to fold laundry?

Fox News AI Newsletter: Microsoft cuts thousands of jobs

Medical identity theft follows you into the doctor's office

Google turns old phones into cloud servers

Apple AI security update proves hackers move fast

Are airline miles still worth it?

Fake Booking.com travel credit scam targets travelers

Starship delivery robots leave campuses for cities

Woman FLOODS apartment after at-home pole dancing accident

Energy secretary criticizes New York’s data center ban amid AI race

How companies are turning devices you bought into billboards

Trump says the late Sen Lindsey Graham's heart condition was 'almost undetectable'

DoorDash driver finishes delivery after being hit by driver fleeing police

Freedom of navigation is a 'fundamental tenet' of the modern world: Ex-Naval CENTCOM commander

Andrew Yang details support for Trump Accounts as program’s rollout begins

Attempted murder suspect leads police on CHASE across GOLF COURSE

Fmr UN ambassador warns of Iranian drone weapons in Cuba, says strikes on US ‘highly possible’

SEE IT: Heroic husky protects child from charging bear attack

Chris Hansen shares critical warning on child sextortion, online predators with Hannity

FBI addresses Nancy Guthrie kidnapping ransom notes, ex-agent weighs in

Young bald eagle takes FIRST flight from famous nest

Uber CEO: This is about making 'everyday life' better

Bear cubs CAUGHT playing tetherball in family backyard

Arsonist sets OWN feet ON FIRE while torching new restaurant

'Gutfeld!': Kids want blue collar jobs

Elderly driver CRASHES EV into convenience store

Charles Payne: Robot stocks are going through the roof

South Korea tests defenses against simulated drone SWARM ATTACK

What did the study actually find?

The numbers don't lie (But context matters)

Why this happens (It's not what you think)

The real-world reality check

Kurt’s key takeaways