Efficient Deployment of Models at the Edge // Krishna Sridhar // #284 episode artwork

EPISODE · Jan 17, 2025 · 51 MIN

Efficient Deployment of Models at the Edge // Krishna Sridhar // #284

from MLOps.community · host Demetrios

Krishna Sridhar is an experienced engineering leader passionate about building wonderful products powered by machine learning. Efficient Deployment of Models at the Edge // MLOps Podcast #284 with Krishna Sridhar, Vice President of Qualcomm.Big shout-out to Qualcomm for sponsoring this episode!// AbstractQualcomm® AI Hub helps to optimize, validate, and deploy machine learning models on-device for vision, audio, and speech use cases. With Qualcomm® AI Hub, you can: Convert trained models from frameworks like PyTorch and ONNX for optimized on-device performance on Qualcomm® devices.Profile models on-device to obtain detailed metrics, including runtime, load time, and compute unit utilization. Verify numerical correctness by performing on-device inference. Easily deploy models using Qualcomm® AI Engine Direct, TensorFlow Lite, or ONNX Runtime.The Qualcomm® AI Hub Models repository contains a collection of example models that use Qualcomm® AI Hub to optimize, validate, and deploy models on Qualcomm® devices. Qualcomm® AI Hub automatically handles model translation from source framework to device runtime, applying hardware-aware optimizations, and performs physical performance/numerical validation. The system automatically provisions devices in the cloud for on-device profiling and inference. The following image shows the steps taken to analyze a model using Qualcomm® AI Hub.// BioKrishna Sridhar leads engineering for Qualcomm™ AI Hub, a system used by more than 10,000 AI developers spanning 1,000 companies to run more than 100,000 models on Qualcomm platforms. Prior to joining Qualcomm, he was Co-founder and CEO of Tetra AI, which made it easy to efficiently deploy ML models on mobile/edge hardware. Prior to Tetra AI, Krishna helped design Apple's CoreML, which was a software system mission-critical to running several experiences at Apple, including Camera, Photos, Siri, FaceTime, Watch, and many more across all major Apple device operating systems and all hardware and IP blocks. He has a Ph.D. in computer science from the University of Wisconsin-Madison and a bachelor’s degree in computer science from Birla Institute of Technology and Science, Pilani, India.// MLOps Swag/Merchhttps://shop.mlops.community/// Related LinksWebsite: https://www.linkedin.com/in/srikris/ --------------- ✌️Connect With Us ✌️ -------------Join our Slack community: https://go.mlops.community/slackFollow us on Twitter: @mlopscommunitySign up for the next meetup: https://go.mlops.community/registerCatch all episodes, blogs, newsletters, and more: https://mlops.community/Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/Connect with Krishna on LinkedIn: https://www.linkedin.com/in/srikris/Timestamps:[00:00] Krishna's preferred coffee[00:12] Takeaways[01:27] Please like, share, leave a review, and subscribe to our MLOps channels![01:56] AI Entrepreneurship Journey[04:25] Core ML and Edge AI[08:44] AI Stack & Workflow Strategy[11:42] On-device AI Foundations[17:15] Hardware vs Software Optimization[21:32] On-device AI Challenges[26:19] Small LLM Orchestration[28:03] Memory Constraints and Shared Pools[30:05] Qualcomm AI Hub Edge[32:53] AI in Unexpected Places[41:53] Deploying AI on Edge[45:58] 4X Battery Optimization Tips[51:00] Wrap up

Krishna Sridhar is an experienced engineering leader passionate about building wonderful products powered by machine learning. Efficient Deployment of Models at the Edge // MLOps Podcast #284 with Krishna Sridhar, Vice President of Qualcomm.Big shout-out to Qualcomm for sponsoring this episode!// AbstractQualcomm® AI Hub helps to optimize, validate, and deploy machine learning models on-device for vision, audio, and speech use cases. With Qualcomm® AI Hub, you can: Convert trained models from frameworks like PyTorch and ONNX for optimized on-device performance on Qualcomm® devices.Profile models on-device to obtain detailed metrics, including runtime, load time, and compute unit utilization. Verify numerical correctness by performing on-device inference. Easily deploy models using Qualcomm® AI Engine Direct, TensorFlow Lite, or ONNX Runtime.The Qualcomm® AI Hub Models repository contains a collection of example models that use Qualcomm® AI Hub to optimize, validate, and deploy models on Qualcomm® devices. Qualcomm® AI Hub automatically handles model translation from source framework to device runtime, applying hardware-aware optimizations, and performs physical performance/numerical validation. The system automatically provisions devices in the cloud for on-device profiling and inference. The following image shows the steps taken to analyze a model using Qualcomm® AI Hub.// BioKrishna Sridhar leads engineering for Qualcomm™ AI Hub, a system used by more than 10,000 AI developers spanning 1,000 companies to run more than 100,000 models on Qualcomm platforms. Prior to joining Qualcomm, he was Co-founder and CEO of Tetra AI, which made it easy to efficiently deploy ML models on mobile/edge hardware. Prior to Tetra AI, Krishna helped design Apple's CoreML, which was a software system mission-critical to running several experiences at Apple, including Camera, Photos, Siri, FaceTime, Watch, and many more across all major Apple device operating systems and all hardware and IP blocks. He has a Ph.D. in computer science from the University of Wisconsin-Madison and a bachelor’s degree in computer science from Birla Institute of Technology and Science, Pilani, India.// MLOps Swag/Merchhttps://shop.mlops.community/// Related LinksWebsite: https://www.linkedin.com/in/srikris/ --------------- ✌️Connect With Us ✌️ -------------Join our Slack community: https://go.mlops.community/slackFollow us on Twitter: @mlopscommunitySign up for the next meetup: https://go.mlops.community/registerCatch all episodes, blogs, newsletters, and more: https://mlops.community/Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/Connect with Krishna on LinkedIn: https://www.linkedin.com/in/srikris/Timestamps:[00:00] Krishna's preferred coffee[00:12] Takeaways[01:27] Please like, share, leave a review, and subscribe to our MLOps channels![01:56] AI Entrepreneurship Journey[04:25] Core ML and Edge AI[08:44] AI Stack & Workflow Strategy[11:42] On-device AI Foundations[17:15] Hardware vs Software Optimization[21:32] On-device AI Challenges[26:19] Small LLM Orchestration[28:03] Memory Constraints and Shared Pools[30:05] Qualcomm AI Hub Edge[32:53] AI in Unexpected Places[41:53] Deploying AI on Edge[45:58] 4X Battery Optimization Tips[51:00] Wrap up

NOW PLAYING

Efficient Deployment of Models at the Edge // Krishna Sridhar // #284

0:00 51:33

No transcript for this episode yet

We transcribe on demand. Request one and we'll notify you when it's ready — usually under 10 minutes.

She’s a Hazard to Herself She’s a Hazard Hi there, I’m Mallory, and I’d like to invite you into our world with “She’s a Hazard to Herself!” Join us as we navigate life with Multiple Sclerosis from the seat of my power wheelchair. Discover stories of resilience, family, and the community we’ve built around chronic illness. Whether you’re impacted by MS or want to learn from our journey, there’s something here for you. So why wait? Subscribe to “She’s a Hazard to Herself” on your favorite podcast app and be part of our journey today. Let’s lift each other up, one episode at a time! Tips, News and Stories for Older Adults Esther C Kane CAPS, C.D.S. "Tips, News, and Stories for Older Adults" delivers weekly insights tailored for seniors. We bring you summaries of curated news, practical advice, and inspiring stories that matter to the 55+ community. From health and finance to technology and lifestyle, our content keeps you informed and engaged. Sourced from trusted outlets, each episode offers valuable information for navigating your golden years. Join us as we explore aging with positivity, wisdom, and engaging stories. Your perfect companion for staying active, learning, and embracing life's later chapters. Prayer Time Heir Waves Prayer Time A podcast especially for our Prayer Time community NEWMORROW SESSIONS - A PodCast Series on the Future of Hospitality Mario C. Bauer, Florian Schneider, Axel Weber & Dr. Tillman Bardt The Newmorrow PodCast is more than a podcast — it's a platform for open dialog on the future of our business, a platform for those building what doesn’t exist yet. Here, we share and embrace our passion for the hospitality industry, but we won’t romanticize the journey. We ask the tough questions, confront uncomfortable truths, and prepare for a future that resists easy answers. We believe that the tougher and wilder times become, the more openly, honestly and humanely people need to talk to each other and act together. We believe, openness, togetherness, and truthfulness should also be cornerstones of a professional community to develop our utopian idea of „open source“. This is a space where visionaries don’t just imagine the future — they wrestle with the paradoxes that shape it: success vs. happiness, data vs. instinct, stability vs. reinvention. Join leaders, entrepreneurs, and thinkers as they share not what made them — but what’s actively shaping them, now and next. So tune in

Frequently Asked Questions

How long is this episode of MLOps.community?

This episode is 51 minutes long.

When was this MLOps.community episode published?

This episode was published on January 17, 2025.

What is this episode about?

Krishna Sridhar is an experienced engineering leader passionate about building wonderful products powered by machine learning. Efficient Deployment of Models at the Edge // MLOps Podcast #284 with Krishna Sridhar, Vice President of Qualcomm.Big...

Can I download this MLOps.community episode?

Yes, you can download this episode by clicking the download button on the episode player, or subscribe to the podcast in your preferred podcast app for automatic downloads.
URL copied to clipboard!