EPISODE · Jun 11, 2024 · 57 MIN

791: Reinforcement Learning from Human Feedback (RLHF), with Dr. Nathan Lambert

from Super Data Science: ML & AI Podcast with Jon Krohn · host Jon Krohn

Reinforcement learning from human feedback (RLHF) has come a long way. In this episode, research scientist Nathan Lambert talks to Jon Krohn about the technique’s origins. He also walks through other ways to fine-tune LLMs and explains how he believes generative AI might democratize education.

This episode is brought to you by AWS Inferentia (go.aws/3zWS0au) and AWS Trainium (go.aws/3ycV6K0), and Crawlbase (crawlbase.com), the ultimate data crawling platform. Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.

In this episode you will learn:
• Why it is important that AI is open [03:13]
• The efficacy and scalability of direct preference optimization [07:32]
• Robotics and LLMs [14:32]
• The challenges to aligning reward models with human preferences [23:00]
• How to make sure AI’s decision making on preferences reflects desirable behavior [28:52]
• Why Nathan believes AI is closer to alchemy than science [37:38]

Additional materials: www.superdatascience.com/791

Frequently Asked Questions

How long is this episode of Super Data Science: ML & AI Podcast with Jon Krohn?

This episode is 57 minutes long.

When was this Super Data Science: ML & AI Podcast with Jon Krohn episode published?

This episode was published on June 11, 2024.
