AF - A rough and incomplete review of some of John Wentworth's research by Nate Soares

Link to original article: https://www.alignmentforum.org/posts/mgjHS6ou7DgwhKPpu/a-rough-and-incomplete-review-of-some-of-john-wentworth-s

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: A rough and incomplete review of some of John Wentworth's research, published by Nate Soares on March 28, 2023 on The AI Alignment Forum.

This is going to be a half-assed review of John Wentworth's research. I studied his work last year, and was kinda hoping to write up a better review, but am lowering my standards on account of how that wasn't happening.

Short version: I've been unimpressed by John's technical ideas related to the natural abstractions hypothesis. He seems to me to have some fine intuitions, and to possess various laudable properties such as a vision for solving the whole dang problem and the ability to consider that everybody else is missing something obvious. That said, I've found his technical ideas to be oversold and underwhelming whenever I look closely.

(By my lights, Lawrence Chan, Leon Lang, and Erik Jenner’s recent post on natural abstractions is overall better than this post, being more thorough and putting a finger more precisely on various fishy parts of John's math. I'm publishing this draft anyway because my post adds a few points that I think are also useful (especially in the section “The Dream”).)

To cite a specific example of a technical claim of John's that does not seem to me to hold up under scrutiny: John has previously claimed that markets are a better model of intelligence than agents, because while collective agents don't have preference cycles, they're willing to pass up certain gains. For example, if an alien tries to sell a basket "Alice loses $1, Bob gains $3", then the market will refuse (because Alice will refuse); and if the alien then switches to selling "Alice gains $3, Bob loses $1", then the market will refuse (because Bob will refuse); but now a certain gain has been passed over.

This argument seems straightforwardly wrong to me, as summarized in a stylized dialogue I wrote (that includes more details about the point). If Alice and Bob are sufficiently capable reasoners then they take both trades and even things out using a side channel. (And even if they don't have a side channel, there are positive-EV contracts they can enter into in advance before they know who will be favored. And if they reason using LDT, they ofc don't need to sign contracts in advance.)

(Aside: A bunch of the difficult labor in evaluating technical claims is in the part where you take a high-falutin' abstract thing like "markets are a better model of intelligence than agents" and pound on it until you get a specific minimal example like "neither of the alien's baskets is accepted by a market consisting of two people named Alice and Bob", at which point the error becomes clear. I haven't seen anybody else do that sort of distillation with John's claims. It seems to me that our community has a dearth of this kind of distillation work. If you're eager to do alignment work, don't know how to help, and think you can do some of this sort of distillation, I recommend trying. MATS might be able to help out.)
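To make the Alice/Bob arithmetic concrete, here is a minimal sketch (not from the original post; the helper names and the "veto" rule are my own framing of the example above) comparing a market in which each member vetoes any basket that costs them money against traders who accept both baskets and can settle up afterwards:

```python
# Illustrative sketch only. The alien offers two baskets:
#   basket_1: Alice -$1, Bob +$3
#   basket_2: Alice +$3, Bob -$1

basket_1 = {"Alice": -1, "Bob": 3}
basket_2 = {"Alice": 3, "Bob": -1}

def veto_market(*baskets):
    """A 'market' where each member vetoes any basket that costs them money."""
    accepted = [b for b in baskets if all(v >= 0 for v in b.values())]
    return {name: sum(b[name] for b in accepted) for name in ("Alice", "Bob")}

def cooperative_traders(*baskets, side_payment=0):
    """Reasoners who accept every basket, then even things out with a side
    payment from the bigger winner to the other party (the split here is
    arbitrary; the point is only that both can come out ahead)."""
    totals = {name: sum(b[name] for b in baskets) for name in ("Alice", "Bob")}
    bigger, smaller = sorted(totals, key=totals.get, reverse=True)
    totals[bigger] -= side_payment
    totals[smaller] += side_payment
    return totals

print(veto_market(basket_1, basket_2))          # {'Alice': 0, 'Bob': 0}: the certain $4 gain is passed up
print(cooperative_traders(basket_1, basket_2))  # {'Alice': 2, 'Bob': 2}: both strictly better off
```

In this symmetric example both traders end up $2 ahead even before any side payment; the side_payment parameter is only there to show how an asymmetric outcome could be evened out through a side channel, as the post describes.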
I pointed this out to John, and (to John's credit) he seemed to update (in realtime, which is rare) ((albeit with a caveat that communicating the point took a while, and didn't transmit the first few times that I tried to say it abstractly before having done the distillation labor)).

The dialogue I wrote recounting that convo is probably not an entirely unfair summary (John said "there was not any point at which I thought my views were importantly misrepresented" when I asked him for comment).

My impression of John's other technical claims about natural abstractions is that they have similar issues. That said, I don't have nearly so crisp a distillation of John's views on natural abstractions, nor nearly so short a refutation. I spent a significant amount of time looking into John’s relevant views (we had overlapping travel plans and conspired t...

First published

03/28/2023

Genres:

education

Duration

30 minutes

Parent Podcast

The Nonlinear Library: Alignment Forum Weekly

Similar Episodes

    AMA: Paul Christiano, alignment researcher by Paul Christiano

    Release Date: 12/06/2021

    Description: Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: AMA: Paul Christiano, alignment researcher, published by Paul Christiano on the AI Alignment Forum. I'll be running an Ask Me Anything on this post from Friday (April 30) to Saturday (May 1). If you want to ask something just post a top-level comment; I'll spend at least a day answering questions. You can find some background about me here. Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org.

    Explicit: No

    What is the alternative to intent alignment called? Q by Richard Ngo

    Release Date: 11/17/2021

    Description: Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: What is the alternative to intent alignment called? Q, published by Richard Ngo on the AI Alignment Forum. Paul defines intent alignment of an AI A to a human H as the criterion that A is trying to do what H wants it to do. What term do people use for the definition of alignment in which A is trying to achieve H's goals (whether or not H intends for A to achieve H's goals)? Secondly, this seems to basically map on to the distinction between an aligned genie and an aligned sovereign. Is this a fair characterisation? (Intent alignment definition from) Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org.

    Explicit: No

    AI alignment landscape by Paul Christiano

    Release Date: 11/19/2021

    Description: Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: AI alignment landscape, published by Paul Christiano on the AI Alignment Forum. Here (link) is a talk I gave at EA Global 2019, where I describe how intent alignment fits into the broader landscape of “making AI go well,” and how my work fits into intent alignment. This is particularly helpful if you want to understand what I’m doing, but may also be useful more broadly. I often find myself wishing people were clearer about some of these distinctions. Here is the main overview slide from the talk: The highlighted boxes are where I spend most of my time. Here are the full slides from the talk. Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org.

    Explicit: No

    Would an option to publish to AF users only be a useful feature? Q by Richard Ngo

    Release Date: 11/17/2021

    Description: Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Would an option to publish to AF users only be a useful feature? Q, published by Richard Ngo on the AI Alignment Forum. Right now there are quite a few private safety docs floating around. There's evidently demand for a privacy setting lower than "only people I personally approve", but higher than "anyone on the internet gets to see it". But this means that safety researchers might not see relevant arguments and information. And as the field grows, passing on access to such documents on a personal basis will become even less efficient. My guess is that in most cases, the authors of these documents don't have a problem with other safety researchers seeing them, as long as everyone agrees not to distribute them more widely. One solution could be to have a checkbox for new posts which makes them only visible to verified Alignment Forum users. Would people use this? Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org.

    Explicit: No

Similar Podcasts

    The Nonlinear Library

    Release Date: 10/07/2021

    Authors: The Nonlinear Fund

    Description: The Nonlinear Library allows you to easily listen to top EA and rationalist content on your podcast player. We use text-to-speech software to create an automatically updating repository of audio content from the EA Forum, Alignment Forum, LessWrong, and other EA blogs. To find out more, please visit us at nonlinear.org

    Explicit: No

    The Nonlinear Library: Alignment Section

    Release Date: 02/10/2022

    Authors: The Nonlinear Fund

    Description: The Nonlinear Library allows you to easily listen to top EA and rationalist content on your podcast player. We use text-to-speech software to create an automatically updating repository of audio content from the EA Forum, Alignment Forum, LessWrong, and other EA blogs. To find out more, please visit us at nonlinear.org

    Explicit: No

    The Nonlinear Library: LessWrong

    Release Date: 03/03/2022

    Authors: The Nonlinear Fund

    Description: The Nonlinear Library allows you to easily listen to top EA and rationalist content on your podcast player. We use text-to-speech software to create an automatically updating repository of audio content from the EA Forum, Alignment Forum, LessWrong, and other EA blogs. To find out more, please visit us at nonlinear.org

    Explicit: No

    The Nonlinear Library: LessWrong Daily

    Release Date: 05/02/2022

    Authors: The Nonlinear Fund

    Description: The Nonlinear Library allows you to easily listen to top EA and rationalist content on your podcast player. We use text-to-speech software to create an automatically updating repository of audio content from the EA Forum, Alignment Forum, LessWrong, and other EA blogs. To find out more, please visit us at nonlinear.org

    Explicit: No

    The Nonlinear Library: EA Forum Daily

    Release Date: 05/02/2022

    Authors: The Nonlinear Fund

    Description: The Nonlinear Library allows you to easily listen to top EA and rationalist content on your podcast player. We use text-to-speech software to create an automatically updating repository of audio content from the EA Forum, Alignment Forum, LessWrong, and other EA blogs. To find out more, please visit us at nonlinear.org

    Explicit: No

    The Nonlinear Library: EA Forum Weekly

    Release Date: 05/02/2022

    Authors: The Nonlinear Fund

    Description: The Nonlinear Library allows you to easily listen to top EA and rationalist content on your podcast player. We use text-to-speech software to create an automatically updating repository of audio content from the EA Forum, Alignment Forum, LessWrong, and other EA blogs. To find out more, please visit us at nonlinear.org

    Explicit: No

    The Nonlinear Library: Alignment Forum Daily

    Release Date: 05/02/2022

    Authors: The Nonlinear Fund

    Description: The Nonlinear Library allows you to easily listen to top EA and rationalist content on your podcast player. We use text-to-speech software to create an automatically updating repository of audio content from the EA Forum, Alignment Forum, LessWrong, and other EA blogs. To find out more, please visit us at nonlinear.org

    Explicit: No

    The Nonlinear Library: LessWrong Weekly

    Release Date: 05/02/2022

    Authors: The Nonlinear Fund

    Description: The Nonlinear Library allows you to easily listen to top EA and rationalist content on your podcast player. We use text-to-speech software to create an automatically updating repository of audio content from the EA Forum, Alignment Forum, LessWrong, and other EA blogs. To find out more, please visit us at nonlinear.org

    Explicit: No

    The Nonlinear Library: Alignment Forum Top Posts

    Release Date: 02/10/2022

    Authors: The Nonlinear Fund

    Description: Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio.

    Explicit: No

    The Nonlinear Library: LessWrong Top Posts

    Release Date: 02/15/2022

    Authors: The Nonlinear Fund

    Description: Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio.

    Explicit: No

    sasodgy

    Release Date: 04/14/2021

    Description: Audio Recordings from the Students Against Sexual Orientation Discrimination (SASOD) Public Forum with Members of Parliament at the National Library in Georgetown, Guyana

    Explicit: No