MLOps Coffee Sessions #10 Analyzing the Article “Continuous Delivery and Automation Pipelines in Machine Learning” // Part 2

EPISODE · Sep 22, 2020 · 1H 7M


from MLOps.community · host Demetrios

In this second installment, David and Demetrios review the Google paper on continuous training and automated pipelines. They dive deep into machine learning monitoring and into what continuous training actually entails.

Some key highlights are:

Automatically retraining and serving the models: when to do it?
- Outlier detection
- Drift detection

Outlier detection:
- What is it?
- How do you deal with it?

Drift detection:
Individual features may start to drift. This could be a bug, or it could be perfectly normal behavior that indicates that the world has changed, requiring the model to be retrained.

Example changes:
- Shifts in people’s preferences
- Marketing campaigns
- Competitor moves
- The weather
- The news cycle
- Locations
- Time
- Devices (clients)

If the world you're working with is changing over time, model deployment should be treated as a continuous process. What this tells me is that you should keep the data scientists and engineers working on the model instead of immediately moving them to another project.

Deeper dive into concept drift:
- Feature/target distributions change
- An overview of concept drift applications: “.. data analysis applications, data evolve over time and must be analyzed in near real time. Patterns and relations in such data often evolve over time; thus, models built for analyzing such data quickly become obsolete over time. In machine learning and data mining, this phenomenon is referred to as concept drift.”
  https://www.win.tue.nl/~mpechen/publications/pubs/CD_applications15.pdf
  https://www-ai.cs.tu-dortmund.de/LEHRE/FACHPROJEKT/SS12/paper/concept-drift/tsymbal2004.pdf

Types of concept drift:
- Sudden
- Gradual

Google, in some way, is trying to address this concern: the world is changing, and you want your ML system to change as well, so that it avoids degraded performance, improves over time, and adapts to its environment. This sort of robustness is necessary for certain domains.

Continuous delivery and automation of pipelines (data, training, prediction service) were built with this in mind: minimizing the commit-to-deploy interval and maximizing the velocity of software delivery and its components (maintainability, extensibility, and testability).

Once the pipeline is ready, you can run it, and you can do so continuously. After the pipeline is deployed to the production environment, it is executed automatically and repeatedly to produce a trained model that is stored in a central model registry.

This pipeline should be able to run on a schedule or based on triggers: events that you have configured for your business domain, such as new data or a drop in performance of the production model.

The link between the model artifact and the pipeline is never severed. Which pipeline trained the model? What data was extracted and validated, and how was it prepared? What was the training configuration, and how was the model evaluated? Metrics are key here. Lineage tracking!

Keeping a close tie between the dev/experiment pipeline and the continuous production pipeline helps avoid inconsistencies between the model artifacts produced by the pipeline and the models being served, which are hard to debug.

Join our Slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with David on LinkedIn: https://www.linkedin.com/in/aponteanalytics/
Connect with Cris Sterry on LinkedIn: https://www.linkedin.com/in/chrissterry/
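As a rough illustration of the retraining triggers mentioned in the notes above (feature drift, a drop in production performance, new data), here is a minimal Python sketch. The episode does not name a specific drift metric; the Population Stability Index (PSI) is used here purely as a stand-in, and every name and threshold (`psi`, `TriggerConfig`, `retrain_reasons`, the 0.2 cutoff, and so on) is a hypothetical assumption rather than anything taken from the Google article.

```python
import math
from dataclasses import dataclass


def psi(reference, current, bins=10):
    """Population Stability Index between two numeric samples.

    Bin edges are taken from the reference sample's range. A small
    epsilon replaces empty-bin fractions so the logarithm is defined.
    """
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for constant data
    eps = 1e-6

    def fractions(sample):
        counts = [0] * bins
        for x in sample:
            # Values outside the reference range land in the edge bins.
            idx = int((x - lo) / width)
            counts[max(0, min(bins - 1, idx))] += 1
        return [max(c / len(sample), eps) for c in counts]

    ref, cur = fractions(reference), fractions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


@dataclass(frozen=True)
class TriggerConfig:
    # All thresholds are illustrative assumptions, not recommendations.
    psi_threshold: float = 0.2     # per-feature drift cutoff
    max_metric_drop: float = 0.05  # tolerated drop versus the baseline metric
    min_new_rows: int = 10_000     # volume of fresh data that justifies a run


def retrain_reasons(reference, current, baseline_metric, live_metric,
                    new_rows, cfg=None):
    """Return the list of reasons to trigger the training pipeline.

    `reference` and `current` map feature names to lists of values from
    training time and from recent production traffic, respectively.
    """
    cfg = cfg or TriggerConfig()
    reasons = []
    for name, ref_values in reference.items():
        if psi(ref_values, current[name]) > cfg.psi_threshold:
            reasons.append(f"drift:{name}")
    if baseline_metric - live_metric > cfg.max_metric_drop:
        reasons.append("performance_drop")
    if new_rows >= cfg.min_new_rows:
        reasons.append("new_data")
    return reasons


if __name__ == "__main__":
    train_age = list(range(100))
    live_age = [x + 50 for x in train_age]  # simulated shift in the feature
    print(retrain_reasons({"age": train_age}, {"age": live_age},
                          baseline_metric=0.90, live_metric=0.80,
                          new_rows=500))
```

A scheduler or monitoring job could call `retrain_reasons` periodically and start the pipeline whenever the returned list is non-empty. Storing the reasons alongside the resulting model would also feed the lineage tracking the notes call for.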



