EPISODE · Feb 28, 2022 · 48 MIN

Better Use cases for Text Embeddings // Vincent Warmerdam // MLOps Coffee Sessions #83

from MLOps.community · host Demetrios

MLOps Coffee Sessions #83 with Vincent Warmerdam, Better Use Cases for Text Embeddings.

Join the Community: https://go.mlops.community/YTJoinIn
Get the newsletter: https://go.mlops.community/YTNewsletter

// Abstract
Text embeddings are very popular, but there are plenty of reasons to be concerned about their applications. There's algorithmic fairness, compute requirements, as well as issues with the datasets that they're typically trained on.

In this session, Vincent gives an overview of some of these properties while also talking about an underappreciated use-case for the embeddings: labeling!

// Bio
Vincent D. Warmerdam is a senior data professional who has worked as an engineer, researcher, team lead, and educator in the past. He's especially interested in understanding algorithmic systems so that one can prevent failure. As such, he has a preference for simpler solutions that scale, as opposed to the latest and greatest from the hype cycle.

He currently works as a Research Advocate at Rasa, where he collaborates with the research team to explain and understand conversational systems better.

Outside of Rasa, Vincent is also well known for his open-source projects (scikit-lego, human-learn, doubtlab, and more), collaborations with open source projects like spaCy, his blog over at koaning.io, and his calm code educational project.

--------------- ✌️Connect With Us ✌️ -------------
Join our Slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Catch all episodes, blogs, newsletter, and more: https://mlops.community/

Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Skylar on LinkedIn: https://www.linkedin.com/in/skylar-payne-766a1988/
Connect with Vincent on LinkedIn: https://www.linkedin.com/in/vincentwarmerdam/

Timestamps:
[00:00] Takeaways
[04:10] Favorite purchases this pandemic
[05:05] What drives Vincent to understand how ML can fail?
[08:33] How and why to make systems simpler?
[11:37] Techniques shared by Vincent in his talks
[15:51] ML as a UI problem
[17:02] Figuring out rules in your data
[20:01] Detecting bad labels
[23:53] Labeling isn't necessarily easy
[25:48] Fraud use case
[27:42] How does Vincent stay sane looking for frauds?
[29:12] How does Vincent produce so many packages?
[31:23] Vincent's favorite package
[33:24] Explosion AI
[36:14] Python all the way
[37:44] Shift from model-centric to data-centric AI
[39:35] Talking about the problem is necessary
[40:40] Vincent's war stories
[44:04] Adding constraints to the system
[47:49] Wrap up

Frequently Asked Questions

How long is this episode of MLOps.community?

This episode is 48 minutes long.

When was this MLOps.community episode published?

This episode was published on February 28, 2022.

What is this episode about?

MLOps Coffee Sessions #83 with Vincent Warmerdam: Better Use Cases for Text Embeddings.

Can I download this MLOps.community episode?

Yes, you can download this episode by clicking the download button on the episode player, or subscribe to the podcast in your preferred podcast app for automatic downloads.