Efficient GPU infrastructure at LinkedIn // Animesh Singh // MLOps Podcast #299

EPISODE · Mar 28, 2025 · 59 MIN

Efficient GPU infrastructure at LinkedIn // Animesh Singh // MLOps Podcast #299

from MLOps.community · host Demetrios

Building Trust Through Technology: Responsible AI in Practice // MLOps Podcast #299 with Animesh Singh, Executive Director, AI Platform and Infrastructure of LinkedIn.Join the Community: https://go.mlops.community/YTJoinIn Get the newsletter: https://go.mlops.community/YTNewsletter // AbstractAnimesh discusses LLMs at scale, GPU infrastructure, and optimization strategies. He highlights LinkedIn's use of LLMs for features like profile summarization and hiring assistants, the rising cost of GPUs, and the trade-offs in model deployment. Animesh also touches on real-time training, inference efficiency, and balancing infrastructure costs with AI advancements. The conversation explores the evolving AI landscape, compliance challenges, and simplifying architecture to enhance scalability and talent acquisition.// BioExecutive Director, AI and ML Platform at LinkedIn | Ex IBM Senior Director and Distinguished Engineer, Watson AI and Data | Founder at Kubeflow | Ex LFAI Trusted AI NA ChairAnimesh is the Executive Director leading the next-generation AI and ML Platform at LinkedIn, enabling the creation of the AI Foundation Models Platform, serving the needs of 930+ million members of LinkedIn. Building Distributed Training Platforms, Machine Learning Pipelines, Feature Pipelines, Metadata engines, etc. Leading the creation of the LinkedIn GAI platform for fine-tuning, experimentation, and inference needs. Animesh has more than 20 patents and 50+ publications. Past IBM Watson AI and Data Open Tech CTO, Senior Director, and Distinguished Engineer, with 20+ years of experience in the Software industry, and 15+ years in AI, Data, and Cloud Platform. Led globally dispersed teams, managed globally distributed projects, and served as a trusted adviser to Fortune 500 firms. Played a leadership role in creating, designing, and implementing Data and AI engines for AI and ML platforms, led Trusted AI efforts, and drove the strategy and execution for Kubeflow, OpenDataHub, and execution in products like Watson OpenScale and Watson Machine Learning. // Related LinksComposable Memory for GPU Optimization // Bernie Wu // Pod #270 - https://youtu.be/ccaDEFoKwko~~~~~~~~ ✌️Connect With Us ✌️ ~~~~~~~Catch all episodes, blogs, newsletters, and more: https://go.mlops.community/TYExploreJoin our Slack community [https://go.mlops.community/slack]Follow us on X/Twitter [@mlopscommunity](https://x.com/mlopscommunity) or [LinkedIn](https://go.mlops.community/linkedin)] Sign up for the next meetup: [https://go.mlops.community/register]MLOps Swag/Merch: [https://shop.mlops.community/]Connect with Demetrios on LinkedIn: /dpbrinkmConnect with Animesh on LinkedIn: /animeshsingh1Timestamps:[00:00] Animesh's preferred coffee[00:16] Takeaways[02:12] What is working? [07:00] What's not working?[13:40] LLM vs Rexis Efficiency[21:49] GPU Utilization and Architecture[27:32] GPU reliability concerns[36:50] Memory Bottleneck in AI[41:06] Optimizing LLM Checkpointing[46:51] Checkpoint Offloading and Platform Design[54:55] Workflow Divergence Points[58:41] Wrap up

NOW PLAYING

Efficient GPU infrastructure at LinkedIn // Animesh Singh // MLOps Podcast #299

0:00 59:13

No transcript for this episode yet

We transcribe on demand. Request one and we'll notify you when it's ready — usually under 10 minutes.

Photo Breakdown Scott Wyden Kivowitz Photo Breakdown is a podcast in which we explore the world of photography with a trusted guide, host Scott Wyden Kivowitz. His expertise and passion bring the industry to life as we explore the stories, trends, and ideas shaping it today. Join us as we dissect everything from incredible photographs and creative techniques to the latest gear releases and hot topics in the photography community.In each episode, we break down what’s happening behind the scenes - whether it’s making a powerful image, a candid discussion on industry trends, or a reflection on the tools and technology changing how we make photographs. You’ll get insights, expert opinions, and a fresh perspective on what’s top of mind for photographers right now.Anticipate short, engaging episodes brimming with ideas and inspiration. Be part of the conversation by sharing your thoughts, voice notes, and comments. Your participation is what makes our community vibrant and dynamic.It’s more than just photography - everyth Popup Chinese Popup Chinese Fresh from Beijing, PopupChinese teaches Chinese as it is actually spoken. Start with our basic Chinese lessons, and in no time you'll be speaking like a Beijinger. Our free daily podcasts, vibrant community, and love for the real China make us the most powerful and personal way to learn mandarin. Linux Game Cast on Odysee Linux Game Cast Helping the Linux community with gaming, podcasting, live streaming, and audio & video production since 2010. [LinuxGameCast Webzone](https://linuxgamecast.com/) She’s a Hazard to Herself She’s a Hazard Hi there, I’m Mallory, and I’d like to invite you into our world with “She’s a Hazard to Herself!” Join us as we navigate life with Multiple Sclerosis from the seat of my power wheelchair. Discover stories of resilience, family, and the community we’ve built around chronic illness. Whether you’re impacted by MS or want to learn from our journey, there’s something here for you. So why wait? Subscribe to “She’s a Hazard to Herself” on your favorite podcast app and be part of our journey today. Let’s lift each other up, one episode at a time!
URL copied to clipboard!