
EPISODE · Mar 7, 2024 · 1H 8M

Ian Osband

from TalkRL: The Reinforcement Learning Podcast · host Robin Ranjit Singh Chauhan

Ian Osband is a Research scientist at OpenAI (ex DeepMind, Stanford) working on decision making under uncertainty.

We spoke about:
- Information theory and RL
- Exploration, epistemic uncertainty and joint predictions
- Epistemic Neural Networks and scaling to LLMs

Featured References

Reinforcement Learning, Bit by Bit
Xiuyuan Lu, Benjamin Van Roy, Vikranth Dwaracherla, Morteza Ibrahimi, Ian Osband, Zheng Wen

From Predictions to Decisions: The Importance of Joint Predictive Distributions
Zheng Wen, Ian Osband, Chao Qin, Xiuyuan Lu, Morteza Ibrahimi, Vikranth Dwaracherla, Mohammad Asghari, Benjamin Van Roy

Epistemic Neural Networks
Ian Osband, Zheng Wen, Seyed Mohammad Asghari, Vikranth Dwaracherla, Morteza Ibrahimi, Xiuyuan Lu, Benjamin Van Roy

Approximate Thompson Sampling via Epistemic Neural Networks
Ian Osband, Zheng Wen, Seyed Mohammad Asghari, Vikranth Dwaracherla, Morteza Ibrahimi, Xiuyuan Lu, Benjamin Van Roy

Additional References

- Thesis defence, Ian Osband
- Homepage, Ian Osband
- Epistemic Neural Networks at Stanford RL Forum
- Behaviour Suite for Reinforcement Learning, Osband et al 2019
- Efficient Exploration for LLMs, Dwaracherla et al 2024



No transcript for this episode yet


Frequently Asked Questions

How long is this episode of TalkRL: The Reinforcement Learning Podcast?

This episode is 1 hour and 8 minutes long.

When was this TalkRL: The Reinforcement Learning Podcast episode published?

This episode was published on March 7, 2024.

What is this episode about?

Ian Osband is a Research scientist at OpenAI (ex DeepMind, Stanford) working on decision making under uncertainty.  We spoke about: - Information theory and RL - Exploration, epistemic uncertainty and joint predictions - Epistemic Neural Networks...

Is there a transcript available for this episode?

Yes, a full transcript is available for this episode. You can read the complete transcript on the episode page.

Can I download this TalkRL: The Reinforcement Learning Podcast episode?

Yes, you can download this episode by clicking the download button on the episode player, or subscribe to the podcast in your preferred podcast app for automatic downloads.