EPISODE · Jun 17, 2026 · 1H 46M

Radically Better Reasoning: Elicit's Andreas Stuhlmüller & Jungwon Byun on World Models for Research

from "The Cognitive Revolution" | AI Builders, Researchers, and Live Player Analysis · host Erik Torenberg, Nathan Labenz

Andreas Stuhlmüller and Jungwon Byun return to discuss how Elicit is building trusted reasoning workflows for scientific research as frontier models grow more powerful but less transparent. They explain process supervision, domain-specific reasoning primitives, and world models that make evidence, causality, and counterfactuals more inspectable. The conversation also covers life sciences use cases, evaluating conflicting evidence, automated software engineering at Elicit, token costs, Gemini, and why legible reasoning may still beat neuralese.

Mercury: Command is Mercury’s new conversational interface, giving you natural-language access to your finances and helping you take actions within your existing permissions and approval policies. Visit https://mercury.com to learn more and apply online in minutes.

LINKS: Elicit Research Platform; Andreas Stuhlmüller Personal Site; Jungwon Byun X Profile; Ought Research Organization; Elicit Founders Previous Episode; GPT-4 Technical Report; Monitoring Reasoning Models Paper; Ought ICE GitHub Repository; Hard-to-Verify Tasks Essay; Karpathy LLM Wiki Gist; Obsidian Knowledge Base App; Mixpanel Analytics Platform; Amplitude Analytics Platform; Anthropic Tracing Thoughts Research; Claude AI Chat Assistant; METR Long Tasks Measurement; Pi Agent Scaffold Repository; Personal AI Infrastructure Repository; Elicit Claude Opus Evaluation; Elicit API Documentation; METR Developer Productivity Study; Elicit Planning Is Unsolved; Rich Sutton Bitter Lesson; Meta Llama AI Models; Recursive San Francisco Event; zero.xyz Agent Tool Access; Anthropic Dynamic Workflows Coverage

Sponsor: Claude: Claude by Anthropic is an AI collaborator that understands your workflow and helps you tackle research, writing, coding, and organization with deep context. Get started with Claude and explore Claude Pro at https://claude.ai/tcr

Frequently Asked Questions

How long is this episode of "The Cognitive Revolution" | AI Builders, Researchers, and Live Player Analysis?

This episode is 1 hour and 46 minutes long.

When was this "The Cognitive Revolution" | AI Builders, Researchers, and Live Player Analysis episode published?

This episode was published on June 17, 2026.

What is this episode about?

Andreas Stuhlmüller and Jungwon Byun return to discuss how Elicit is building trusted reasoning workflows for scientific research as frontier models grow more powerful but less transparent. They explain process supervision, domain-specific reasoning...
