
EPISODE · Apr 29, 2026 · 29 MIN

EVA - A Framework for Evaluating Voice Agents by ServiceNow

from ServiceNow Podcasts · host ServiceNow Community Podcasts

Voice AI agent evaluation: why it's fundamentally harder than text, how cascade failures derail conversations invisibly, and ServiceNow's open-source framework to establish industry evaluation standards. Featuring real audio examples showing authentication failures, leaked reasoning, and latency problems.

WHAT WE COVER

TARA BOGAVELLI, Research Engineer, ServiceNow
Leading the open-source voice agent evaluation framework. Explains why existing benchmarks don't measure what matters and what ServiceNow is releasing to establish industry standards.

KATRINA STANKIEWICZ, Staff Machine Learning Engineer, ServiceNow
Cascade model architecture expert. Breaks down STT → LLM → TTS failure modes, named entity transcription challenges, and real audio example analysis.

GABRIELLE GAUTHIER MELANÇON, Staff Applied Research Scientist, ServiceNow
Multi-language evaluation specialist. Reveals why Large Audio Language Models lag behind, the native speaker requirement, and bot-to-bot simulation methodology.

CHAPTERS

0:00 Introduction: The evaluation gap
1:11 ServiceNow's Open-Source Framework Announcement - Tara Bogavelli
2:43 Meet the Researchers
3:43 Voice-Specific Challenges - Tara Bogavelli
5:03 Cascade Architecture: STT → LLM → TTS - Katrina Stankiewicz
7:57 The Named Entity Problem - Katrina Stankiewicz
10:06 Evaluation Metrics: Accuracy vs Experience - Gabrielle Gauthier Melançon
11:23 Bot-to-Bot Testing at Scale - Gabrielle Gauthier Melançon
14:30 The LALM Gap: Why Audio AI Judges Struggle - Tara Bogavelli
16:57 Real Audio Example: Flight Rebooking Gone Wrong
21:58 Breaking Down the Failures - Katrina Stankiewicz
28:30 Wrap-Up & Resources

KEY INSIGHTS

The Cascade Failure Problem: STT → LLM → TTS errors propagate invisibly.
Named Entity Transcription: The #1 enterprise blocker. Names, confirmation codes, and emails break authentication.
Accuracy vs Experience: Perfect task completion means nothing if users hang up due to poor experience.
LALM Gap: Large Audio Language Models lag behind text LLMs, so human evaluators remain essential.
Latency Kills Conversations: Five-second pauses make users think the call dropped, breaking the experience even when tasks complete.
Open-Source Framework: ServiceNow is releasing evaluation tools, metrics, and bot-to-bot simulation methodology for the industry.

LEARN MORE

Website: https://servicenow.github.io/eva/
GitHub: https://github.com/servicenow/eva
Blog Post: https://huggingface.co/blog/ServiceNow-AI/eva
Dataset: https://huggingface.co/datasets/ServiceNow-AI/eva

ABOUT

Hosted by Bobby Brill. ServiceNow Insights podcast explores AI research, real-world applications, and the people building the future of work.

#VoiceAI #AIEvaluation #ServiceNow #MachineLearning #OpenSource #ConversationalAI #STT #TTS #LLM #VoiceAgents #AIResearch #Podcast

See omnystudio.com/listener for privacy information.



Frequently Asked Questions

How long is this episode of ServiceNow Podcasts?

This episode is 29 minutes long.

When was this ServiceNow Podcasts episode published?

This episode was published on April 29, 2026.

