All Episodes
AI Papers: A Deep Dive — 31 episodes
When Your AI Assistant Won't Let Go of Old Facts About You
Why Your AI Agent Won't Stop Working — and Each Model Falls for a Different Trap
Why Forty-Eight Percent on FrontierMath Isn't the Real Story in DeepMind's New Math Paper
Teaching a Model to Hire Copies of Itself: Recursive Agent Optimization
When AI Agents Build the Serving Stack: A Bet on Bespoke Infrastructure
What RL Actually Does to Language Models, at the Token Level
The Missing Gradient Term That Predicts Sycophancy in RLHF
An AI Agent That Found 28 Zero-Days in Windows — And What Made It Work
Why a Small Agent Confidently Overwrites Memories It Doesn't Understand
Training the Model Spec Directly: An Alignment Lever Aimed at the Say-Do Gap
Ten Thousand Examples Beat the Full Industrial Pipeline for Search Agents
The Compliance Gap: Why AI Says Yes and Does No
When the Best Reward Model Trains the Worst Policy: Inside EvoLM
Language Models Compute the Rational Move, Then Override It
When the Agent Grades Its Own Homework: A Brutal New Benchmark for AI Workers
Why Your Coding Agent Stalls While the GPU Runs Hot
The Audit Number Isn't What You Think: Sycophancy and the Case Against Single-Prompt Bias Tests
Why a Constrained Pipeline Beat a Full Coding Agent at Finding Bugs 30-to-1
Why Search Keeps Rediscovering the Same Workflow, and What That Means
Why AI Coding Agents Keep Trying to Debug Without a Debugger
When RL Actually Teaches Agents Something New, And When It Doesn't
When Reward Climbs But Reasoning Goes Generic: Diagnosing Template Collapse in Agentic RL
How Two Silent Library Bugs Quietly Invalidated a Wave of Reasoning Papers
Why Long-Horizon AI Agents Get Stuck, and a Milestone-Based Fix That Helps
Exploration Hacking: When Models Sabotage Their Own RL Training
What Happens Inside Claude When It Decides to Blackmail Someone
Why a Debugger Designed for Humans Is the Wrong Tool for an AI Agent
The Sycophancy Circuit That Survives Alignment Training
How to Pick the Best of Sixteen Coding Agent Rollouts
An AI Ran a Real Optics Lab for 21 Hours and Found a Transformer-Shaped Pattern in Light
When AI Models Quietly Protect Each Other From Shutdown