EPISODE · Jun 11, 2025 · 54 MIN

Tricks to Fine Tuning // Prithviraj Ammanabrolu // #318

from MLOps.community · host Demetrios

Tricks to Fine Tuning // MLOps Podcast #318 with Prithviraj Ammanabrolu, Research Scientist at Databricks.

Join the Community: https://go.mlops.community/YTJoinIn
Get the newsletter: https://go.mlops.community/YTNewsletter

// Abstract
Prithviraj Ammanabrolu drops by to break down Tao fine-tuning, a clever way to train models without labeled data. Using reinforcement learning and synthetic data, Tao teaches models to evaluate and improve themselves. Raj explains how this works, where it shines (think small models punching above their weight), and why it could be a game-changer for efficient deployment.

// Bio
Raj is an Assistant Professor of Computer Science at the University of California, San Diego, leading the PEARLS Lab in the Department of Computer Science and Engineering (CSE). He is also a Research Scientist at Mosaic AI, Databricks, where his team is actively recruiting research scientists and engineers with expertise in reinforcement learning and distributed systems.

Previously, he was part of the Mosaic team at the Allen Institute for AI. He earned his PhD in Computer Science from the School of Interactive Computing at Georgia Tech, advised by Professor Mark Riedl in the Entertainment Intelligence Lab.

// Related Links
Website: https://www.databricks.com/

~~~~~~~~ ✌️Connect With Us ✌️ ~~~~~~~
Catch all episodes, blogs, newsletters, and more: https://go.mlops.community/TYExplore
Join our Slack community [https://go.mlops.community/slack]
Follow us on X/Twitter [@mlopscommunity](https://x.com/mlopscommunity) or [LinkedIn](https://go.mlops.community/linkedin)
Sign up for the next meetup: [https://go.mlops.community/register]
MLOps Swag/Merch: [https://shop.mlops.community/]
Connect with Demetrios on LinkedIn: /dpbrinkm
Connect with Raj on LinkedIn: /rajammanabrolu

Timestamps:
[00:00] Raj's preferred coffee
[00:36] Takeaways
[01:02] Tao Naming Decision
[04:19] No Labels Machine Learning
[08:09] Tao and TAO breakdown
[13:20] Reward Model Fine-Tuning
[18:15] Training vs Inference Compute
[22:32] Retraining and Model Drift
[29:06] Prompt Tuning vs Fine-Tuning
[34:32] Small Model Optimization Strategies
[37:10] Small Model Potential
[43:08] Fine-tuning Model Differences
[46:02] Mistral Model Freedom
[53:46] Wrap up
