“Control protocols don’t always need to know which models are scheming” by Fabien Roger

EPISODE · Apr 26, 2026 · 10 MIN

from LessWrong (30+ Karma)

These are my personal views.

To detect if an agent is taking a catastrophically dangerous action, you might want to monitor its actions using the smartest model that is too weak to be a schemer. But knowing which models are weak enough that they are unlikely to scheme is difficult, which puts you in a difficult spot: take a model too strong and it might actually be a schemer and lie to you, take a model too weak and it might fail just because it's too dumb to notice the danger. So instead you can just use all models as monitors at once and block an action if any of the monitors flags it as dangerous[1].

In some situations, this lets you build protocols that you can implement without knowing which models are scheming and that get the same level of safety as if you perfectly knew which models were scheming (though it doesn’t let you know what level of safety you actually have).

This post presents some toy models illustrating the benefits of ensembling monitors of various trustedness and explains what consequences it might have in practice to be using protocols that are somewhat agnostic about which of the [...]

---

Outline:

(01:30) Toy models
(01:33) The n=2 case
(03:52) Continuum of models
(04:50) Ensembling monitors in practice

The original text contained 1 footnote which was omitted from this narration.

---

First published: April 26th, 2026

Source: https://www.lesswrong.com/posts/WiYDawNhFp5cM7uqF/control-protocols-don-t-always-need-to-know-which-models-are

---

Narrated by TYPE III AUDIO.
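The "block if any monitor flags it" rule from the description can be sketched in a few lines of Python. This is only an illustration, not the post's own toy model: the three monitors, their string-matching rules, and the example actions are all hypothetical stand-ins for real models of different capability and trustedness.

```python
from typing import Callable, Iterable

# A monitor takes an action and returns True if it judges the action dangerous.
Monitor = Callable[[str], bool]


def ensemble_blocks(action: str, monitors: Iterable[Monitor]) -> bool:
    """Block the action if at least one monitor flags it as dangerous."""
    return any(monitor(action) for monitor in monitors)


# Hypothetical monitors (assumptions made purely for illustration).
def weak_monitor(action: str) -> bool:
    # Too weak to scheme, but only notices obvious danger.
    return "rm -rf" in action


def strong_honest_monitor(action: str) -> bool:
    # Strong and not scheming: notices obvious and subtle danger.
    return "rm -rf" in action or "exfiltrate" in action


def strong_scheming_monitor(action: str) -> bool:
    # Strong but scheming: lies and never flags anything.
    return False


if __name__ == "__main__":
    actions = ["ls", "rm -rf /", "quietly exfiltrate weights"]
    worlds = {
        "strong model is honest": [weak_monitor, strong_honest_monitor],
        "strong model is scheming": [weak_monitor, strong_scheming_monitor],
    }
    for world, monitors in worlds.items():
        for action in actions:
            print(world, "|", action, "| blocked:", ensemble_blocks(action, monitors))
```

In this sketch the protocol is identical in both worlds. If the strong model is honest, the ensemble catches everything the strong monitor catches; if it is scheming, the ensemble still catches whatever the weak monitor catches. The operator never has to decide in advance which monitor to trust, though an OR over monitors can also only raise the number of benign actions that get blocked.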


