John Schulman episode artwork

EPISODE · Oct 18, 2022 · 44 MIN

John Schulman

from TalkRL: The Reinforcement Learning Podcast · host Robin Ranjit Singh Chauhan

John Schulman is a cofounder of OpenAI, and currently a researcher and engineer at OpenAI.Featured ReferencesWebGPT: Browser-assisted question-answering with human feedbackReiichiro Nakano, Jacob Hilton, Suchir Balaji, Jeff Wu, Long Ouyang, Christina Kim, Christopher Hesse, Shantanu Jain, Vineet Kosaraju, William Saunders, Xu Jiang, Karl Cobbe, Tyna Eloundou, Gretchen Krueger, Kevin Button, Matthew Knight, Benjamin Chess, John SchulmanTraining language models to follow instructions with human feedbackLong Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, John Schulman, Jacob Hilton, Fraser Kelton, Luke Miller, Maddie Simens, Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan LoweAdditional ReferencesOur approach to alignment research, OpenAI 2022Training Verifiers to Solve Math Word Problems, Cobbe et al 2021UC Berkeley Deep RL Bootcamp Lecture 6: Nuts and Bolts of Deep RL Experimentation, John Schulman 2017Proximal Policy Optimization Algorithms, Schulman 2017Optimizing Expectations: From Deep Reinforcement Learning to Stochastic Computation Graphs, Schulman 2016

John Schulman, OpenAI cofounder and researcher, inventor of PPO/TRPO talks RL from human feedback, tuning GPT-3 to follow instructions (InstructGPT) and answer long-form questions using the internet (WebGPT), AI alignment, AGI timelines, and more!

NOW PLAYING

John Schulman

0:00 44:21

No transcript for this episode yet

We transcribe on demand. Request one and we'll notify you when it's ready — usually under 10 minutes.

Frequently Asked Questions

How long is this episode of TalkRL: The Reinforcement Learning Podcast?

This episode is 44 minutes long.

When was this TalkRL: The Reinforcement Learning Podcast episode published?

This episode was published on October 18, 2022.

What is this episode about?

John Schulman is a cofounder of OpenAI, and currently a researcher and engineer at OpenAI.Featured ReferencesWebGPT: Browser-assisted question-answering with human feedbackReiichiro Nakano, Jacob Hilton, Suchir Balaji, Jeff Wu, Long Ouyang,...

Is there a transcript available for this episode?

Yes, a full transcript is available for this episode. You can read the complete transcript on the episode page.

Can I download this TalkRL: The Reinforcement Learning Podcast episode?

Yes, you can download this episode by clicking the download button on the episode player, or subscribe to the podcast in your preferred podcast app for automatic downloads.
URL copied to clipboard!