EPISODE · May 29, 2026 · 32 MIN
An AI Invented Four Sources to Defend One Wrong Answer (and Anthropic's New Opus 4.8 Bets on Honesty)
from AI for Everyone Podcast · host Harrison Painter
Anthropic just released Claude Opus 4.8, and the headline improvement is unusual: the model is built to flag its own uncertainty and say "I'm not sure." Anthropic says it's roughly four times less likely to let a flaw pass without catching it. When a company's flagship upgrade is honesty, that tells you something about where we are.

Here is the other side of it. Harrison asked Google's Gemini one simple factual question for an article he was writing: did Jeff Dunham use AI to create the opening visuals for his 2024 comedy special? Gemini said yes, confidently, and cited a source. When Harrison pushed on that source, the tool did not check itself. It invented a new one. Then another. By the end it had manufactured four separate references, including a word-for-word on-screen quote that does not exist, before finally admitting the only real source was a single unsourced blog post.

This episode walks the whole chain step by step. You will learn:

- The exact failure mode: when an AI hits a popular but unverified claim, it gets confident instead of careful, and every round of pushback produces a fresh citation instead of a fresh doubt.
- Why the Vectara Hallucination Leaderboard shows roughly one in ten outputs is wrong on a task as simple as summarizing a document.
- A five-step, 30-minute verification process you can run on almost any claim before you repeat it.
- Where source verification sits in The 7 Levels of AI Proficiency (it defines Level 3, the Critical Thinker) and why that is the level every working professional should be reaching for in 2026.
- Three things to do this week to protect your own credibility.

This is not an anti-AI episode. Harrison uses these tools every day. It is about the difference between trusting a tool blindly and trusting it after you have checked. That second posture is what separates an amateur from a professional whose name is on the line.

Want to know where you stand? The 7 Levels of AI Proficiency assessment is free and takes 10 minutes: assess.launchready.ai

Harrison Painter
Executive AI Advisor
LaunchReady.ai. Further. Faster.