EPISODE · Jan 1, 2026 · 29 MIN
This Deep Research Agent Ignored the Benchmark and Still Won
from YAAP (Yet Another AI Podcast) · host AI21
Tavily built a Deep Research Agent with production in mind. Something they could actually scale. So they did the unsexy work. They went through millions of agent logs, found where tokens were being wasted, and optimized each section of the system. The result surprised them: they cut token consumption by more than half (!), then tested quality and discovered they topped the DeepResearch Bench without even trying.

In this YAAP episode, Yuval sits down with Dean from Tavily to break down how they built it, what they did differently from the usual top approaches, and which design choices made better results possible with far fewer tokens.

What you’ll learn:
- How to reduce token burn without tanking quality
- Why reading millions of logs beats chasing the flashiest tech
- The design choices that pushed quality up while tokens dropped hard