EPISODE · Mar 22, 2024 · 1H 15M

The Art and Science of Training LLMs // Bandish Shah and Davis Blalock // #219

from MLOps.community · host Demetrios

Join us at our first in-person conference on June 25, all about AI Quality: https://www.aiqualityconference.com/

Huge thank you to Databricks AI for sponsoring this episode.

Bandish Shah is an Engineering Manager at MosaicML/Databricks, where he focuses on making generative AI training and inference efficient, fast, and accessible by bridging the gap between deep learning, large-scale distributed systems, and performance computing.

Davis Blalock is a Research Scientist and the first employee of Mosaic ML: a GenAI startup acquired for $1.3 billion by Databricks.

MLOps podcast #219 with Databricks' Engineering Manager, Bandish Shah and Research Scientist Davis Blalock, The Art and Science of Training Large Language Models.

// Abstract
What's hard about language models at scale? Turns out...everything. MosaicML's Davis and Bandish share war stories and lessons learned from pushing the limits of LLM training and helping dozens of customers get LLMs into production. They cover what can go wrong at every level of the stack, how to make sure you're building the right solution, and some contrarian takes on the future of efficient models.

// Bio
Bandish Shah
Bandish Shah is an Engineering Manager at MosaicML/Databricks, where he focuses on making generative AI training and inference efficient, fast, and accessible by bridging the gap between deep learning, large-scale distributed systems, and performance computing. Bandish has over a decade of experience building systems for machine learning and enterprise applications. Prior to MosaicML, Bandish held engineering and development roles at SambaNova Systems where he helped develop and ship the first RDU systems from the ground up, and Oracle where he worked as an ASIC engineer for SPARC-based enterprise servers.

Davis Blalock
Davis Blalock is a research scientist at MosaicML. He completed his PhD at MIT, advised by Professor John Guttag. His primary work is designing high-performance machine learning algorithms. He received his M.S. from MIT and his B.S. from the University of Virginia. He is a Qualcomm Innovation Fellow, NSF Graduate Research Fellow, and Barry M. Goldwater Scholar.

// MLOps Jobs board
jobs.mlops.community

// MLOps Swag/Merch
https://mlops-community.myshopify.com/

// Related Links
AI Quality In-person Conference: https://www.aiqualityconference.com/
Website: http://databricks.com/
Davis Summarizes Papers Newsletter signup link
Davis' Newsletters:
Learning to recognize spoken words from five unlabeled examples in under two seconds: https://arxiv.org/abs/1609.09196
Training on data at 5GB/s in a single thread: https://arxiv.org/abs/1808.02515
Nearest-neighbor searching through billions of images per second in one thread with no indexing: https://arxiv.org/abs/1706.10283
Multiplying matrices 10-100x faster than a matrix multiply (with some approximation error): https://arxiv.org/abs/2106.10860
Hidden Technical Debt in Machine Learning Systems: https://proceedings.neurips.cc/paper_files/paper/2015/file/86df7dcfd896fcaf2674f757a2463eba-Paper.pdf

--------------- ✌️Connect With Us ✌️ -------------
Join our Slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Catch all episodes, blogs, newsletters, and more: https://mlops.community/
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Davis on LinkedIn: https://www.linkedin.com/in/dblalock/
Connect with Bandish on LinkedIn: https://www.linkedin.com/in/bandish-shah/
