
EPISODE · Jun 3, 2026 · 33 MIN

239: Can AI Copilots Keep Up with Pathologists?

from Digital Pathology Podcast · host Aleksandra Zuraw, DVM, PhD

Can AI copilots really keep up with pathologists when the cases are new, the workflow is messy, and the benchmark is actually protected from leakage?

In this episode of DigiPath Digest #48, I focus on one paper: DALPHIN: Benchmarking Digital Pathology AI Copilots Against Pathologists on an Open Multicentric Dataset. I chose this paper because I think the field needs more of this kind of work. Less hype. More evaluation. Less “look what AI can do.” More “how do we test it in a way that actually means something?”

In this session, I look at what makes DALPHIN important for pathologists, lab leaders, and digital pathology trailblazers trying to make sense of pathology AI right now. The paper benchmarks three models against human pathologists: two general-purpose models, Gemini 2.5 Pro and GPT-5, and one pathology-specific model, PathChat+. The dataset includes 1,236 images from 300 cases, covering 130 diagnoses, 14 pathology subspecialties, and cases from six countries. Human performance is benchmarked with 31 pathologists from 10 countries.

What I like about this paper is that it does not stop at top-line performance. It deals with the benchmarking problem itself. The authors built a sequestered, indirectly accessible ground truth so the evaluation data could not simply be scraped into model training. That matters because without that protection, benchmarking can become an illusion of genius rather than a real test of generalization.

The results are interesting and more nuanced than a simple win-or-lose story. PathChat+ reached expert-level performance in four of six tasks, Gemini in two of six, and GPT in one of six. That tells us something important already: pathology-specific training matters. But it also does not mean pathology is solved. In organ recognition, expert pathologists still outperformed all the models. In rare cancers, none of the models reached expert-level performance. And in ambiguous cases, the models still struggled with something human pathologists do all the time: expressing uncertainty.

I also spend time on one of the most practical parts of the paper: model behavior. Gemini tended to overcall. GPT tended to undercall. PathChat was more balanced. That matters in practice. A pathologist using a copilot needs to know the tool’s calibration bias before they can safely interpret what it is telling them. I also talk about anchoring bias in conversational interfaces, where early hallucinations can propagate through later answers if memory is not reset between questions. That is not just a technical curiosity. That is a workflow and safety issue.

Why should you listen? Because this episode is really about a bigger question: What kind of evidence should pathologists demand before AI copilots enter real workflows? If you want to understand validation, data leakage, rare-case performance, uncertainty, and why these tools should still be treated as co-pilots rather than autopilots, this is a useful paper to know.

Episode Highlights

01:20 – Why I chose the DALPHIN preprint and why benchmarking matters right now.
05:38 – What is in the DALPHIN dataset: 300 cases, 130 diagnoses, 14 subspecialties, 6 countries.
07:57 – Top-line performance: PathChat+ reaches expert-level performance in 4 of 6 tasks.
09:41 – The benchmarking trap of data leakage and why DALPHIN’s sequestered ground truth matters.
12:19 – Why real pathology diagnosis is not text-only and why macro + micro context matters.
15:26 – Tissue recognition, neoplasm detection, ambiguity, and conversational memory: how the testing was structured.
21:29 – The diagnostic personalities of the models: overcalling, undercalling, and balanced behavior.
24:36 – Rare cancers: where AI copilots still fall short of expert human performance.
28:00 – Why binary outputs are not enough when pathology often lives in uncertainty.
31:37 – Anchoring bias and conversational memory: how early hallucinations can keep propagating.
37:11 – Why these tools should be treated as co-pilots, not autopilots.
40:29 – Resources for beginners: Digital Pathology 101 and continued AI literacy.

Resources mentioned

DALPHIN preprint: arXiv:2605.03544v1
DALPHIN evaluation platform: dalphin.grand-challenge.org
PathChat+ pathology-specific AI model discussed in the benchmark.
Digital Pathology 101 free eBook by Dr. Aleksandra Zuraw.
Educational streams on tissue recognition and computer vision literacy mentioned in the session.

Get the "Digital Pathology 101" FREE E-book and join us!



