When Training Data and Real Data Diverge

EPISODE · May 23, 2026 · 8 MIN

from The Data Science Podcast with Fexingo: Analytics, Machine Learning, and Data-Driven Conversations · host Fexingo

Episode 7 of The Data Science Podcast tackles distribution shift: what happens when the data a model sees in production differs significantly from its training data. Lucas and Luna walk through a concrete example from a ride-hailing app whose demand prediction model failed during a holiday surge. They use that case to explain covariate shift, prior probability shift, and concept drift, and they discuss practical detection methods, including the Kolmogorov-Smirnov test and the population stability index (PSI). The episode also covers monitoring strategies and retraining triggers for building robust ML systems.
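For readers who want to try the two detection methods named above, here is a minimal sketch in plain Python. It is not taken from the episode. The function names, the simulated demand figures, the ten quantile bins, and the 0.25 PSI alert level (a common rule of thumb) are all assumptions made for illustration. The KS check uses the standard large-sample critical value, where 1.358 corresponds to a 5% significance level.

```python
import bisect
import math
import random


def ks_statistic(train, prod):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a, b = sorted(train), sorted(prod)
    d = 0.0
    # The largest gap is always attained at one of the observed values.
    for x in a + b:
        cdf_a = bisect.bisect_right(a, x) / len(a)
        cdf_b = bisect.bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


def ks_rejects(train, prod, c_alpha=1.358):
    """Large-sample KS check; c_alpha=1.358 corresponds to alpha = 0.05."""
    n, m = len(train), len(prod)
    return ks_statistic(train, prod) > c_alpha * math.sqrt((n + m) / (n * m))


def psi(train, prod, bins=10, eps=1e-6):
    """Population stability index over quantile bins of the training sample."""
    ref = sorted(train)
    # Interior bin edges placed at the training quantiles.
    edges = [ref[int(len(ref) * i / bins)] for i in range(1, bins)]

    def shares(values):
        counts = [0] * bins
        for v in values:
            counts[bisect.bisect_right(edges, v)] += 1
        # eps keeps the logarithm finite when a bin is empty.
        return [max(c / len(values), eps) for c in counts]

    expected, actual = shares(train), shares(prod)
    return sum((a - e) * math.log(a / e) for e, a in zip(expected, actual))


rng = random.Random(0)
# Hypothetical hourly ride requests: ordinary weeks vs. a holiday surge.
train_demand = [rng.gauss(100, 15) for _ in range(2000)]
holiday_demand = [rng.gauss(160, 30) for _ in range(2000)]

print("KS statistic:", round(ks_statistic(train_demand, holiday_demand), 3))
print("KS rejects at 5%:", ks_rejects(train_demand, holiday_demand))
print("PSI:", round(psi(train_demand, holiday_demand), 3))
print("PSI above 0.25 (assumed alert level):", psi(train_demand, holiday_demand) > 0.25)
```

Both checks compare one feature at a time, so they can flag covariate shift in an input such as hourly demand. On their own they would not reveal concept drift, where the relationship between inputs and the target changes while the input distribution stays the same.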
