EPISODE · Jun 27, 2026 · 17 MIN
“Deployment Awareness Matters More Than Evaluation Awareness” by VojtaKovarik, Tomáš Gavenčiak, Mateusz Bagiński
TL;DR

Evaluation awareness (an AI recognizing it's being evaluated) is a widely discussed concept in AI safety. But there is a closely related concept that we claim is more important: deployment awareness, the AI's ability to recognize when it is not being evaluated and when its actions matter. A misaligned AI with deployment awareness can game evaluations without any evaluation awareness at all, with a simple strategy: act aligned by default, and deviate only when confident you're in real deployment and your actions matter for your goals. This requires two ingredients: occasionally recognizable deployment situations, and enough self-reflective and strategic reasoning for the AI to anticipate and plan around this. We think "deployment awareness" better identifies what makes evaluations fragile, and we develop this idea below.

Concept: Evaluation awareness
Explanation: AI is being tested and confidently believes that this is so
Comments: This only becomes a problem if most evaluations trigger evaluation awareness, and if the AI knows that. Or if the AI has good self-locating reasoning.

Concept: Deployment awareness
Explanation: AI is not being tested and confidently believes it is not being tested
Comments: This is a problem even if it happens rarely (if some of those rare [...]

Outline:
(00:13) TL;DR
(01:20) Side note: it's really about consequences, not about evaluation vs. deployment
(03:23) Evaluation awareness, deployment awareness, and self-locating beliefs
(04:54) Evaluation awareness is less dangerous than it seems
(06:58) Deployment awareness is more dangerous than it seems
(09:29) Evaluation gaming with no evaluation or deployment awareness
(12:35) Final comments
(13:33) Appendix: A formal (toy) model

The original text contained 13 footnotes which were omitted from this narration.

First published: June 26th, 2026
Source: https://www.lesswrong.com/posts/XP794SHDuXYfWLrvJ/deployment-awareness-matters-more-than-evaluation-awareness

Narrated by TYPE III AUDIO.