Max Agency Podcast - All Episodes

4

How Ramp built an AI agent that can think outside of tokens | Alex Shevchenko

Alexander Shevchenko is the head of applied research at Ramp, where he leads Ramp Labs, the team behind Ramp Sheets and a steady stream of public AI engineering experiments. Ramp Sheets started as an internal process mining tool that turned Loom videos of accountants into Markov diagrams, before evolving into the agentic spreadsheet editor that shipped in November. In this conversation, Alex walks through the architecture under the hood, why Ramp biases the agent toward Excel formulas over Python code gen, and two recent Labs experiments: Latent Briefing and a user-steerable revival of Golden Gate Claude.

We also discuss:
Under the hood of Ramp Sheets
Inspect, Ramp's internal coding agent, and the self-improving monitor loop it powers
Why finance professionals rejected code gen as too "black box"
Why Anthropic models tend to excel at agentic spreadsheet manipulation
The case for putting the agent outside the sandbox, not inside it
The Loom-to-Markov-diagram process mining pipeline
RLMs and how subagents can share memory in latent space
Latent Briefing and KV-cache communication between subagents
Reviving Golden Gate Claude with steering vectors on Gemma

Referenced: Alex Levinson, Anthropic, Ben Geist, Claude, Efficient Memory Sharing for Multi-Agent Systems via KV Cache Compaction (Ben Geist), Gemma, Golden Gate Claude, Graphviz, Inspect, Latent Briefing, Loom, Modal, OpenAI, Opus, Qwen, Ramp, Ramp Labs, Ramp Sheets, Recursive Language Models (Alex Zhang), Retool, Self-maintaining Ramp Sheets, Steer AI

Where to find Alex: LinkedIn, Twitter/X, Website
Where to find Harrison: LinkedIn, Twitter/X
Where to find LangChain: Website, Docs

Send feedback or questions to [email protected]

00:00 Introduction
01:13 The origin of Ramp Sheets
02:27 The Loom-to-Markov-diagram process mining pipeline
04:28 Why code gen approaches felt too "black box" to finance
06:13 Meeting finance where they already are: inside the spreadsheet
09:08 How far process mining got them
10:31 Text descriptions and Graphviz DAGs as output
12:41 Under the hood of Ramp Sheets
14:52 Why the agent uses Python only as an escape hatch
15:47 Why Anthropic models excel at agentic spreadsheet manipulation
17:12 Frankensteining the OpenAI Agents SDK
17:43 The Ramp Sheets UX and fast vs. expert mode
19:58 Agent in a sandbox vs. agent with a sandbox
21:55 Vibe evals with expert humans
23:40 Inspect, the internal coding agent
24:13 The self-monitoring loop and auto-PRs
28:01 Other wacky experiments on Sheets
28:43 Memory experiments that didn't pan out
31:16 Latent Briefing and KV-cache subagent communication
35:13 Reviving Golden Gate Claude
37:47 Contrastive pairs and steering vectors
39:47 Picking the right layers in Gemma
41:37 What Ramp Labs looks for when hiring

May 7, 2026

44m
3

How Listen is building a system of AI Agents & subagents for specialized tasks | Florian Juengermann, CTO

Florian Juengermann is the co-founder and CTO of Listen, an AI startup that turns qualitative research across hundreds of interviews, surveys, and focus groups into structured, traceable insights. Listen's agents analyze responses at scale, and Florian has rearchitected the system multiple times to get there. In this conversation, he walks through the virtual table architecture at the core of their Research Agent, how small models run map-reduce classification across thousands of open-ended responses, and the self-reviewing feedback subagent that catches errors during long async runs.

We also discuss:
The three agents inside Listen's platform
How Listen rearchitected multiple times, from a simple RAG bot to a multi-agent system
Why the PowerPoint subagent was completely rebuilt using the Claude Code SDK
Contextual prompt engineering as an alternative to skills
How Listen keeps report numbers live as new interview responses come in
When to trigger the long-running agent vs. showing early results
What Florian looks for when hiring agent engineers

References: Anthropic, ChatGPT, Claude, Claude Code SDK, E2B, Emotional Intelligence, GPT Mini, Haiku, Listen, OpenAI, Pandas, Postgres, Python, Research Agent, Render, Zoom

Where to find Florian: LinkedIn, Twitter/X
Where to find Harrison: LinkedIn, Twitter/X
Where to find LangChain: Website, Docs

Send feedback or questions to [email protected]

00:00 Introduction
01:25 The three agents inside Listen's platform
03:15 Live chat vs. long async runs, and how Listen tunes for each
05:33 Under the hood of the Research Agent
06:37 Listen's virtual table architecture
07:34 How small models classify thousands of open-ended responses
10:05 Running code in a sandbox: how E2B fits in
11:52 Why Listen rebuilt the PowerPoint subagent from scratch
14:11 Contextual prompt engineering instead of skills
16:32 The feedback subagent that reviews its own reports
18:14 How Listen runs evals in production
19:47 Unexpected ways users push the agent to its limits
21:42 How many times Listen has rearchitected, and why
24:59 Trace observability: depth over breadth
26:10 Lessons from running Claude Code SDK inside E2B
27:42 Memory: what's solved and what isn't
29:10 The Composer agent UX: co-editing a document with AI
35:50 How Listen keeps report numbers live as new responses come in
43:47 What Listen looks for when hiring agent engineers

Apr 23, 2026

47m
2

How Hex builds AI agents that reason like human data analysts | Izzy Miller, AI Engineer

Izzy Miller is an AI engineer at Hex, an AI analytics platform that was one of the first companies to ship data agents to real paying users. Today, Hex runs a multi-agent system with nearly 100K tokens of tools, and Izzy is building a 90-day simulation to evaluate whether those agents actually get smarter over time. In this conversation, he walks through the harness decisions that shaped their architecture, the failure modes Hex is seeing at scale, and what it takes to build an eval that no current model can pass.

We also discuss:
Why data agents are harder to verify than coding agents
Under the hood of Hex's agents
How Hex is unifying separate agents
Why most eval sets are bad
The 90-day simulation for long-horizon evals
How Izzy went from marketing to AI engineer

References: Andon Labs, Anthropic, Barry McCardel, ChatGPT, Claude Code, Claude Sonnet 4.6, DBT, GPT-3.5 Turbo, GPT-5.3 Codex Spark, GPT-5.4, Hex, LangChain, LangSmith, Looker, OpenAI, Opus 4.6, Satya Nadella, Snowflake, Vending Machine

Where to find Izzy: LinkedIn, Twitter/X
Where to find Harrison: LinkedIn, Twitter/X
Where to find LangChain: Website, Docs

Send feedback or questions to [email protected]

01:35 Where Hex's notebook agent started
03:46 The moment Hex knew it was time for agents
07:36 Why data agents are harder to verify than coding agents
09:30 How Hex is unifying separate agents
13:28 Under the hood of the notebook agent
15:41 The harness features that are now holding the agent back
17:41 Why Hex built their own orchestrator
18:59 Managing nearly 100K tokens of tools
20:49 Ephemeral queries and agent behavior trade-offs
24:46 The UX problem with showing agents' thinking
27:28 Why verification is harder than transparency for data agents
31:00 Memory, context conflicts, and collapse modes
34:38 How Hex built their internal eval system
39:29 Why most eval sets are bad
44:30 The 900% quota eval that every model fails
46:55 Model upgrades and the "in distribution" debate
51:34 How Izzy went from marketer to AI engineer
59:59 The 90-day simulation for long-horizon evals

Apr 9, 2026

1h 08m
1

Welcome to Max Agency

Welcome to Max Agency, the podcast that goes deep into how the best agents are being built by builders like you. I'm Harrison Chase, CEO of LangChain, the agent engineering company, and I'll be your host.

Apr 8, 2026

0m

ABOUT THIS SHOW

Welcome to Max Agency, a podcast about how the best AI agents are actually being built. Hosted by Harrison Chase, CEO of LangChain, each episode goes deep with the builders designing, deploying, and learning from real agent systems in the wild. From architecture decisions to evals, tooling, and failure modes, Max Agency is for people who want to understand what it really takes to build useful agents.

HOSTED BY

LangChain
