Bottle Caps Aren't Optimisers by DanielFilan

First published

12/04/2021

Genres:

education

Summary

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Bottle Caps Aren't Optimisers, published by DanielFilan on the AI Alignment Forum. Crossposted from my blog.

One thing I worry about sometimes is people writing code with optimisers in it, without realising that that's what they were doing. An example of this: suppose you were doing deep reinforcement learning, doing optimisation to select a controller (that is, a neural network that takes a percept and returns an action) that generated high reward in some environment. Alas, unknown to you, this controller actually did optimisation itself to select actions that score well according to some metric that so far has been closely related to your reward function. In such a scenario, I'd be wary about your deploying that controller, since the controller itself is doing optimisation which might steer the world into a weird and unwelcome place.

In order to avoid such scenarios, it would be nice if one could look at an algorithm and determine if it was doing optimisation. Ideally, this would involve an objective definition of optimisation that could be checked from the source code of the algorithm, rather than something like "an optimiser is a system whose behaviour can't usefully be predicted mechanically, but can be predicted by assuming it near-optimises some objective function", since such a definition breaks down when you have the algorithm's source code and can compute its behaviour mechanically.

You might think about optimisation as follows: a system is optimising some objective function to the extent that that objective function attains much higher values than would be attained if the system didn't exist, or were doing some other random thing. This type of definition includes those put forward by Yudkowsky and Oesterheld. However, I think there are crucial counterexamples to this style of definition.

Firstly, consider a lid screwed onto a bottle of water. If not for this lid, or if the lid had a hole in it or were more loose, the water would likely exit the bottle via evaporation or being knocked over, but with the lid, the water stays in the bottle much more reliably than otherwise. As a result, you might think that the lid is optimising the water remaining inside the bottle. However, I claim that this is not the case: the lid is just a rigid object designed by some optimiser that wanted water to remain inside the bottle. This isn't an incredibly compelling counterexample, since it doesn't qualify as an optimiser according to Yudkowsky's definition: it can be more simply described as a rigid object of a certain shape than an optimiser, so it isn't an optimiser. I am somewhat uncomfortable with this move (surely systems that are sub-optimal in complicated ways that are easily predictable by their source code should still count as optimisers?), but it's worth coming up with another counterexample to which this objection won't apply.

Secondly, consider my liver. It's a complex physical system that's hard to describe, but if it were absent or behaved very differently, my body wouldn't work, I wouldn't remain alive, and I wouldn't be able to make any money, meaning that my bank account balance would be significantly lower than it is. In fact, subject to the constraint that the rest of my body works in the way that it actually works, it's hard to imagine what my liver could do which would result in a much higher bank balance.
Nevertheless, it seems wrong to say that my liver is optimising my bank balance, and more right to say that it "detoxifies various metabolites, synthesizes proteins, and produces biochemicals necessary for digestion"---even though that gives a less precise account of the liver's behaviour. In fact, my liver's behaviour has something to do with optimising my income: it was created by evolution, which was sort of an optimisation process for agents that r...
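
The deployment worry described in the summary can be made concrete with a small sketch. The following Python toy is not from the post; every name, number, and dynamic in it is invented purely for illustration. An outer loop does optimisation to select a controller, and the selected controller itself contains an optimiser: an argmax over actions against an internal metric that happened to track reward during selection.

import random

ACTIONS = range(10)

def make_controller(internal_metric):
    # The returned policy is itself an optimiser: it searches over actions
    # for whatever internal metric it happens to encode.
    def controller(percept):
        return max(ACTIONS, key=lambda action: internal_metric(percept, action))
    return controller

def reward(percept, action):
    # Toy reward used by the outer loop.
    return -abs(action - percept)

def outer_optimisation(n_candidates=50, n_percepts=100):
    # Outer loop: pick the candidate controller whose behaviour scores best
    # on reward. Each candidate's internal metric agrees with reward to a
    # varying degree; nothing guarantees the winner keeps agreeing with it
    # once deployed outside the percepts it was selected on.
    candidates = []
    for _ in range(n_candidates):
        drift = random.uniform(-2, 2)
        candidates.append(
            make_controller(lambda p, a, d=drift: reward(p, a) + d * a)
        )
    percepts = [random.randrange(10) for _ in range(n_percepts)]
    return max(candidates, key=lambda c: sum(reward(p, c(p)) for p in percepts))

chosen = outer_optimisation()
# 'chosen' looks like an ordinary function, but its source contains an argmax:
# the kind of optimiser-inside-the-artifact the post is warning about.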
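
The "counterfactual impact" style of definition quoted above can also be sketched in a few lines. Again this is an illustrative toy rather than anything taken from the post, or from Yudkowsky's or Oesterheld's actual definitions, and the bottle numbers are made up: it scores a system by how much higher an objective ends up with the system present than under baselines where the system is absent or doing something random, and shows how the bottle lid comes out looking like an optimiser under that measure.

def water_remaining(lid_behaviour):
    # Toy objective: fraction of water left in the bottle after a day,
    # under three made-up lid behaviours.
    return {"sealed": 0.99, "loose": 0.60, "absent": 0.10}[lid_behaviour]

def optimisation_power(objective, system_behaviour, baseline_behaviours):
    # Counterfactual-impact score: objective value with the system present,
    # minus the average value over baselines where the system is absent
    # or doing some other random thing.
    baseline = sum(objective(b) for b in baseline_behaviours) / len(baseline_behaviours)
    return objective(system_behaviour) - baseline

print(optimisation_power(water_remaining, "sealed", ["loose", "absent"]))
# Prints a large positive gap, so by this measure the lid "optimises" water
# retention; that it plainly is not an optimiser is the post's counterexample.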

Parent Podcast

The Nonlinear Library: Alignment Forum Top Posts

Similar Episodes

    AMA: Paul Christiano, alignment researcher by Paul Christiano

    Release Date: 12/06/2021

    Description: Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: AMA: Paul Christiano, alignment researcher, published by Paul Christiano on the AI Alignment Forum. I'll be running an Ask Me Anything on this post from Friday (April 30) to Saturday (May 1). If you want to ask something just post a top-level comment; I'll spend at least a day answering questions. You can find some background about me here. Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org.

    Explicit: No

    Would an option to publish to AF users only be a useful feature? by Richard Ngo

    Release Date: 11/17/2021

    Description: Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Would an option to publish to AF users only be a useful feature?, published by Richard Ngo on the AI Alignment Forum. Right now there are quite a few private safety docs floating around. There's evidently demand for a privacy setting lower than "only people I personally approve", but higher than "anyone on the internet gets to see it". But this means that safety researchers might not see relevant arguments and information. And as the field grows, passing on access to such documents on a personal basis will become even less efficient. My guess is that in most cases, the authors of these documents don't have a problem with other safety researchers seeing them, as long as everyone agrees not to distribute them more widely. One solution could be to have a checkbox for new posts which makes them only visible to verified Alignment Forum users. Would people use this? Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org.

    Explicit: No

    What is the alternative to intent alignment called? by Richard Ngo

    Release Date: 11/17/2021

    Description: Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: What is the alternative to intent alignment called?, published by Richard Ngo on the AI Alignment Forum. Paul defines intent alignment of an AI A to a human H as the criterion that A is trying to do what H wants it to do. What term do people use for the definition of alignment in which A is trying to achieve H's goals (whether or not H intends for A to achieve H's goals)? Secondly, this seems to basically map on to the distinction between an aligned genie and an aligned sovereign. Is this a fair characterisation? (Intent alignment definition from) Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org.

    Explicit: No

    Welcome & FAQ! by Ruben Bloom, Oliver Habryka

    Release Date: 12/05/2021

    Description: Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Welcome & FAQ!, published by Ruben Bloom, Oliver Habryka on the AI Alignment Forum. The AI Alignment Forum was launched in 2018. Since then, several hundred researchers have contributed approximately two thousand posts and nine thousand comments. Nearing the third birthday of the Forum, we are publishing this updated and clarified FAQ. [Image: minimalist watercolor sketch of humanity spreading across the stars, by VQGAN] I have a practical question concerning a site feature. Almost all of the Alignment Forum site features are shared with LessWrong.com; have a look at the LessWrong FAQ for questions concerning the Editor, Voting, Questions, Notifications & Subscriptions, Moderation, and more. If you can’t easily find the answer there, ping us on Intercom (bottom right of screen) or email us at [email protected]. What is the AI Alignment Forum? The Alignment Forum is a single online hub for researchers to discuss all ideas related to ensuring that transformatively powerful AIs are aligned with human values. Discussion ranges from technical models of agency to the strategic landscape, and everything in between. Top voted posts include What failure looks like, Are we in an AI overhang?, and Embedded Agents. A list of the top posts of all time can be viewed here. While direct participation in the Forum is limited to deeply established researchers in the field, we have designed it also as a place where up-and-coming researchers can get up to speed on the research paradigms and have pathways to participation too. See How can non-members participate in the Forum? below. We hope that by being the foremost discussion platform and publication destination for AI Alignment discussion, the Forum will serve as the archive and library of the field. To find posts by sub-topic, view the AI section of the Concepts page. Why was the Alignment Forum created? Foremost, because misaligned powerful AIs may pose the greatest risk to our civilization that has ever arisen. The problem is of unknown (or at least unagreed upon) difficulty, and allowing the researchers in the field to better communicate and share their thoughts seems like one of the best things we could do to help the pre-paradigmatic field. In the past, journals or conferences might have been the best methods for increasing discussion and collaboration, but in the current age we believe that a well-designed online forum with things like immediate publication, distributed rating of quality (i.e. “peer review”), portability/shareability (e.g. via links), etc., provides the most promising way for the field to develop good standards and methodologies. A further major benefit of having alignment content and discussion in one easily accessible place is that it helps new researchers get onboarded to the field. Hopefully, this will help them begin contributing sooner. Who is the AI Alignment Forum for? There exists an interconnected community of Alignment researchers in industry, academia, and elsewhere who have spent many years thinking carefully about a variety of approaches to alignment. Such research receives institutional support from organizations including FHI, CHAI, DeepMind, OpenAI, MIRI, Open Philanthropy, ARC, and others. The Alignment Forum membership currently consists of researchers at these organizations and their respective collaborators.
The Forum is also intended to be a way to interact with and contribute to the cutting edge research for people not connected to these institutions either professionally or socially. There have been many such individuals on LessWrong, and that is the current best place for such people to start contributing, to be given feedback and to skill-up in this domain. There are about 50-100 members of the Forum who are (1) able to post and comment directly to the Forum without review, (2) able to promo...

    Explicit: No

Similar Podcasts

    The Nonlinear Library

    Release Date: 10/07/2021

    Authors: The Nonlinear Fund

    Description: The Nonlinear Library allows you to easily listen to top EA and rationalist content on your podcast player. We use text-to-speech software to create an automatically updating repository of audio content from the EA Forum, Alignment Forum, LessWrong, and other EA blogs. To find out more, please visit us at nonlinear.org

    Explicit: No

    Effective Altruism Forum Podcast

    Release Date: 07/17/2021

    Authors: Garrett Baker

    Description: I (and hopefully many others soon) read particularly interesting or impactful posts from the EA forum.

    Explicit: No

    This Week in /r/reactjs

    Release Date: 09/07/2020

    Authors: This Week in /r/reactjs

    Description: Top posts of the week in <10 minutes!

    Explicit: No

    Two Librarians & A Microphone

    Release Date: 09/07/2020

    Authors: Ingram Library Services

    Description: ABOUT LIBRARY INNOVATIONS: This season, inspired by the Urban Library Council's Annual Forum theme, Innovations, focuses on the 2019 Top Innovator and Honorable Mention award winners, which were selected based on the inventiveness of their library program or service, the outcomes achieved, and how readily other libraries could adopt and adapt the innovation. Tune in to hear firsthand how libraries are expanding the boundaries of what is possible for 21st-century libraries and their communities.

    Explicit: No

    Readit Reddit

    Release Date: 08/30/2020

    Description: Reading top posts from Reddit!

    Explicit: No

    What's Good, Reddit?

    Release Date: 03/29/2021

    Authors: "What's Good, Reddit!" Crew

    Description: The top Reddit posts and comments from the top subreddits read aloud to you on a weekly basis

    Explicit: No

    What's New at Liberty Middle School

    Release Date: 08/23/2020

    Authors: Susan Martin

    Description: What's New posts podcasts from Liberty Middle School's library. These podcasts review books of interest to middle schoolers.

    Explicit: No

    sasodgy

    Release Date: 04/14/2021

    Description: Audio Recordings from the Students Against Sexual Orientation Discrimination (SASOD) Public Forum with Members of Parliament at the National Library in Georgetown, Guyana

    Explicit: No

    Sista's With Crosses

    Release Date: 10/05/2021

    Authors: Sista

    Description: Sista's With Crosses provides a forum where women of different ages, realms, and demographics bring truth and light to the perspectives of social media posts.

    Explicit: No

    Top Shelf at the Merrick Library

    Release Date: 08/31/2020

    Authors: Carol Ann Tack

    Description: Top Shelf at the Merrick Library is a podcast of all things books. Everything you'll need to stock your TOP SHELF!

    Explicit: No

    Great Falls Forum

    Release Date: 08/22/2020

    Authors: Sun Journal

    Description: The Great Falls Forum is a monthly speaker series in Maine featuring statewide and regional leaders in public policy, business, academia and the arts. The forum is a co-sponsorship of the Sun Journal, Bates College and the Lewiston (Maine) Public Library. http://www.sunjournal.com

    Explicit: No

    The Marketing Detective

    Release Date: 09/25/2020

    Authors: Mitch West

    Description: Solving the marketing mysteries that challenge local business across linear and nonlinear platforms.

    Explicit: No