EPISODE · Feb 18, 2021 · 1H 32M
BI 098 Brian Christian: The Alignment Problem
from Brain Inspired · host Paul Middlebrooks
Brian and I discuss a range of topics related to his latest book, The Alignment Problem: Machine Learning and Human Values. The alignment problem asks how we can build AI that does what we want it to do, as opposed to building AI that will compromise our own values by accomplishing tasks that may be harmful or dangerous to us. Using some of the stories Brian relates in the book, we talk about:

The history of machine learning and how we got to this point;
Some methods researchers are creating to understand what's being represented in neural nets and how they generate their output;
Some modern proposed solutions to the alignment problem, like programming the machines to learn our preferences so they can help achieve those preferences, an idea called inverse reinforcement learning;
The thorny issue of accurately knowing our own values: if we get those wrong, will machines also get them wrong?

Links:
Brian's website.
Twitter: @brianchristian.
The Alignment Problem: Machine Learning and Human Values.
Related papers:
Norbert Wiener from 1960: Some Moral and Technical Consequences of Automation.

Timestamps:
4:22 - Increased work on AI ethics
8:59 - The Alignment Problem overview
12:36 - Stories as important for intelligence
16:50 - What is the alignment problem?
17:37 - Who works on the alignment problem?
25:22 - AI ethics degree?
29:03 - Human values
31:33 - AI alignment and evolution
37:10 - Knowing our own values?
46:27 - What have we learned about ourselves?
58:51 - Interestingness
1:00:53 - Inverse RL for value alignment
1:04:50 - Current progress
1:10:08 - Developmental psychology
1:17:36 - Models as the danger
1:25:08 - How worried are the experts?