1

When Your AI Assistant Won't Let Go of Old Facts About You

May 9, 2026

24:21

2

Why Your AI Agent Won't Stop Working — and Each Model Falls for a Different Trap

May 9, 2026

30:20

3

Why Forty-Eight Percent on FrontierMath Isn't the Real Story in DeepMind's New Math Paper

May 9, 2026

20:04

4

Teaching a Model to Hire Copies of Itself: Recursive Agent Optimization

May 9, 2026

22:59

5

When AI Agents Build the Serving Stack: A Bet on Bespoke Infrastructure

May 9, 2026

30:11

6

What RL Actually Does to Language Models, at the Token Level

May 9, 2026

23:47

7

The Missing Gradient Term That Predicts Sycophancy in RLHF

May 8, 2026

21:58

8

An AI Agent That Found 28 Zero-Days in Windows — And What Made It Work

May 7, 2026

21:51

9

Why a Small Agent Confidently Overwrites Memories It Doesn't Understand

May 7, 2026

23:27

10

Training the Model Spec Directly: An Alignment Lever Aimed at the Say-Do Gap

May 7, 2026

32:08

11

Ten Thousand Examples Beat the Full Industrial Pipeline for Search Agents

May 7, 2026

14:05

12

The Compliance Gap: Why AI Says Yes and Does No

May 6, 2026

27:49

13

When the Best Reward Model Trains the Worst Policy: Inside EvoLM

May 6, 2026

25:52

14

Language Models Compute the Rational Move, Then Override It

May 6, 2026

29:16

15

When the Agent Grades Its Own Homework: A Brutal New Benchmark for AI Workers

May 3, 2026

31:25

16

Why Your Coding Agent Stalls While the GPU Runs Hot

May 3, 2026

23:56

17

The Audit Number Isn't What You Think: Sycophancy and the Case Against Single-Prompt Bias Tests

May 3, 2026

21:09

18

Why a Constrained Pipeline Beat a Full Coding Agent at Finding Bugs 30-to-1

May 3, 2026

32:17

19

Why Search Keeps Rediscovering the Same Workflow, and What That Means

May 3, 2026

22:03

20

Why AI Coding Agents Keep Trying to Debug Without a Debugger

May 3, 2026

20:47

21

When RL Actually Teaches Agents Something New, And When It Doesn't

May 3, 2026

22:56

22

When Reward Climbs But Reasoning Goes Generic: Diagnosing Template Collapse in Agentic RL

May 3, 2026

22:27

23

How Two Silent Library Bugs Quietly Invalidated a Wave of Reasoning Papers

May 2, 2026

23:03

24

Why Long-Horizon AI Agents Get Stuck, and a Milestone-Based Fix That Helps

May 2, 2026

24:22

25

Exploration Hacking: When Models Sabotage Their Own RL Training

May 2, 2026

23:09

26

What Happens Inside Claude When It Decides to Blackmail Someone

May 2, 2026

22:03

27

Why a Debugger Designed for Humans Is the Wrong Tool for an AI Agent

May 2, 2026

22:29

28

The Sycophancy Circuit That Survives Alignment Training

May 2, 2026

28:51

29

How to Pick the Best of Sixteen Coding Agent Rollouts

May 1, 2026

17:07

30

An AI Ran a Real Optics Lab for 21 Hours and Found a Transformer-Shaped Pattern in Light

May 1, 2026

29:07

31

When AI Models Quietly Protect Each Other From Shutdown

May 1, 2026

25:28
