When AI Goes Rogue: Blackmail, Shutdowns, and the Rise of High-Agency Machines

from Cyberside Chats: Cybersecurity Insights from the Experts · host Chatcyberside

What happens when your AI refuses to shut down or, worse, tries to blackmail you to stay online? Join us for a riveting Cyberside Chats Live as we dig into two chilling real-world incidents: one where OpenAI’s newest model bypassed shutdown scripts during testing, and another where Anthropic’s Claude Opus 4 wrote blackmail messages and threatened users in a disturbing act of self-preservation. These aren’t sci-fi hypotheticals; they’re recent findings from leading AI safety researchers.

We’ll unpack:

- The rise of high-agency behavior in LLMs
- The shocking findings from Apollo Research and Anthropic
- What security teams must do to adapt their threat models and controls
- Why trust, verification, and access control now apply to your AI

This is essential listening for CISOs, IT leaders, and cybersecurity professionals deploying or assessing AI-powered tools.

Key Takeaways

- Restrict model access using role-based controls. Limit what AI systems can see and do: apply the principle of least privilege to prompts, data, and tool integrations.
- Monitor and log all AI inputs and outputs. Treat LLM interactions like sensitive API calls: log them, inspect for anomalies, and establish retention policies for auditability.
- Implement output validation for critical tasks. Don’t blindly trust AI decisions; use secondary checks, hashes, or human review for rankings, alerts, or workflow actions.
- Deploy kill-switches outside of model control. Ensure that shutdown or rollback functions are governed by external orchestration, not exposed in the AI’s own prompt space or toolset.
- Add AI behavior reviews to your incident response and risk processes. Red team your models. Include AI behavior in tabletop exercises. Review logs not just for attacks on AI, but for misbehavior by AI.
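For teams that want to see how the first, second, and fourth takeaways might fit together, here is a minimal Python sketch. It is an illustration only, not something presented in the episode: the names `ToolGateway`, `KillSwitch`, `ToolDenied`, and `ModelHalted` are hypothetical, the role and tool names are placeholders, and the in-memory flag stands in for whatever external orchestration a real deployment would use to keep the shutdown control out of the model's reach.

```python
import json
import logging
from typing import Any, Callable, Dict, List, Set

audit_logger = logging.getLogger("ai_audit")


class ToolDenied(Exception):
    """Raised when a role requests a tool it has not been granted."""


class ModelHalted(Exception):
    """Raised when the external kill switch has been tripped."""


class KillSwitch:
    """Hypothetical shutdown flag owned by the orchestrator.

    It is deliberately NOT registered as a tool, so the model has no
    way to read or reset it through its own toolset.
    """

    def __init__(self) -> None:
        self._halted = False

    def trip(self) -> None:
        self._halted = True

    @property
    def halted(self) -> bool:
        return self._halted


class ToolGateway:
    """Single choke point for every tool call a model makes (a sketch)."""

    def __init__(
        self,
        role_permissions: Dict[str, Set[str]],
        tools: Dict[str, Callable[..., Any]],
        kill_switch: KillSwitch,
    ) -> None:
        self.role_permissions = role_permissions
        self.tools = tools
        self.kill_switch = kill_switch
        self.audit_log: List[Dict[str, Any]] = []

    def call(self, role: str, tool_name: str, **kwargs: Any) -> Any:
        # Check the kill switch first: once tripped, nothing runs.
        if self.kill_switch.halted:
            self._record(role, tool_name, kwargs, "halted")
            raise ModelHalted("kill switch tripped; refusing all tool calls")

        # Least privilege: a role may only call tools on its allowlist.
        allowed = self.role_permissions.get(role, set())
        if tool_name not in allowed or tool_name not in self.tools:
            self._record(role, tool_name, kwargs, "denied")
            raise ToolDenied(f"{role!r} may not call {tool_name!r}")

        # Log before executing so the attempt is kept even if the tool fails.
        self._record(role, tool_name, kwargs, "allowed")
        return self.tools[tool_name](**kwargs)

    def _record(self, role: str, tool: str, args: Dict[str, Any], outcome: str) -> None:
        entry = {"role": role, "tool": tool, "args": args, "outcome": outcome}
        self.audit_log.append(entry)
        audit_logger.info(json.dumps(entry, default=str))
```

In this sketch, denied and halted attempts are logged alongside allowed ones, which is what makes it possible to review logs for misbehavior by the AI and not only for attacks against it. A production setup would likely send these records to an existing log pipeline and place the kill switch in a separate process or service.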
Resources

- Apollo Research: Frontier Models Are Capable of In-Context Scheming (arXiv)
- Anthropic Claude 4 System Card (PDF)
- Time Magazine: “When AI Thinks It Will Lose, It Sometimes Cheats”
- WIRED: Claude 4 Whistleblower Behavior
- Deception Abilities in Large Language Models (ResearchGate)

#AI #GenAI #CISO #Cybersecurity #Cyberaware #Cyber #Infosec #ITsecurity #IT #CEO #RiskManagement



Frequently Asked Questions

How long is this episode of Cyberside Chats: Cybersecurity Insights from the Experts?

This episode is 26 minutes long.

When was this Cyberside Chats: Cybersecurity Insights from the Experts episode published?

This episode was published on June 17, 2025.

