AF - What I would do if I wasn't at ARC Evals by Lawrence Chan

Link to original article: https://www.alignmentforum.org/posts/6FkWnktH3mjMAxdRT/what-i-would-do-if-i-wasn-t-at-arc-evals

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: What I would do if I wasn't at ARC Evals, published by Lawrence Chan on September 5, 2023 on The AI Alignment Forum.

In which: I list 9 projects that I would work on if I wasn't busy working on safety standards at ARC Evals, and explain why they might be good to work on.

Epistemic status: I'm prioritizing getting this out fast as opposed to writing it carefully. I've thought for at least a few hours and talked to a few people I trust about each of the following projects, but I haven't done much digging into any of them, and it's likely that I'm wrong about many material facts. I also make little claim to the novelty of the projects. I'd recommend looking into these yourself before committing to doing them. (Total time spent writing or editing this post: ~8 hours.)

Standard disclaimer: I'm writing this in my own capacity. The views expressed are my own and should not be taken to represent the views of ARC/FAR/LTFF/Lightspeed or any other org or program I'm involved with.

Thanks to Ajeya Cotra, Caleb Parikh, Chris Painter, Daniel Filan, Rachel Freedman, Rohin Shah, Thomas Kwa, and others for comments and feedback.

Introduction

I'm currently working as a researcher on the Alignment Research Center Evaluations Team (ARC Evals), where I work on lab safety standards. I'm reasonably sure that this is one of the most useful things I could be doing with my life. Unfortunately, there are a lot of problems in the world to solve, and a lot of balls being dropped, that I don't have time to get to thanks to my day job. Here's an unsorted and incomplete list of projects that I would consider doing if I wasn't at ARC Evals:

1. Ambitious mechanistic interpretability.
2. Getting people to write papers/writing papers myself.
3. Creating concrete projects and research agendas.
4. Working on OP's funding bottleneck.
5. Working on everyone else's funding bottleneck.
6. Running the Long-Term Future Fund.
7. Onboarding senior(-ish) academics and research engineers.
8. Extending the young-EA mentorship pipeline.
9. Writing blog posts/giving takes.

I've categorized these projects into three broad categories and will discuss each in turn below. For each project, I'll also list who I think should work on it, as well as some of my key uncertainties. Note that this document isn't really written to help me decide between projects, but rather as a list of promising projects for someone with a skillset similar to mine. As such, there's not much discussion of personal fit. If you're interested in working on any of the projects, please reach out or post in the comments below!

Relevant beliefs I have

Before jumping into the projects I think people should work on, it's worth outlining some of the core beliefs that inform my thinking and project selection:

Importance of A(G)I safety: I think A(G)I safety is one of the most important problems to work on, and all the projects below are thus aimed at AI safety.

Value beyond technical research: Technical AI Safety (AIS) research is crucial, but other types of work are valuable as well. Efforts aimed at improving AI governance, grantmaking, and community building are important, and we should give more credit to those doing good work in those areas.

High discount rate for current EA/AIS funding: There are several reasons for this. First, EA/AIS funders are currently in a unique position due to a surge in AI Safety interest without a proportional increase in funding; I expect this dynamic to change and our influence to wane as additional funding and governments enter this space. Second, efforts today are important for paving the path to future efforts. Third, my timelines are relatively short, which increases the importance of current funding.

Building a robust EA/AIS ecosystem: The EA/AIS ecosystem should be more prepared for unpredictable s...

First published

09/05/2023

Genres:

education

Duration

21 minutes

Parent Podcast

The Nonlinear Library: Alignment Forum Weekly

Similar Episodes

    AMA: Paul Christiano, alignment researcher by Paul Christiano

    Release Date: 12/06/2021

    Description: Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: AMA: Paul Christiano, alignment researcher, published by Paul Christiano on the AI Alignment Forum. I'll be running an Ask Me Anything on this post from Friday (April 30) to Saturday (May 1). If you want to ask something just post a top-level comment; I'll spend at least a day answering questions. You can find some background about me here. Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org.

    Explicit: No

    What is the alternative to intent alignment called? Q by Richard Ngo

    Release Date: 11/17/2021

    Description: Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: What is the alternative to intent alignment called? Q, published by Richard Ngo on the AI Alignment Forum. Paul defines intent alignment of an AI A to a human H as the criterion that A is trying to do what H wants it to do. What term do people use for the definition of alignment in which A is trying to achieve H's goals (whether or not H intends for A to achieve H's goals)? Secondly, this seems to basically map on to the distinction between an aligned genie and an aligned sovereign. Is this a fair characterisation? (Intent alignment definition from) Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org.

    Explicit: No

    AI alignment landscape by Paul Christiano

    Release Date: 11/19/2021

    Description: Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: AI alignment landscape, published by Paul Christiano on the AI Alignment Forum. Here (link) is a talk I gave at EA Global 2019, where I describe how intent alignment fits into the broader landscape of “making AI go well,” and how my work fits into intent alignment. This is particularly helpful if you want to understand what I’m doing, but may also be useful more broadly. I often find myself wishing people were clearer about some of these distinctions. Here is the main overview slide from the talk: The highlighted boxes are where I spend most of my time. Here are the full slides from the talk. Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org.

    Explicit: No

    Would an option to publish to AF users only be a useful feature? Q by Richard Ngo

    Release Date: 11/17/2021

    Description: Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Would an option to publish to AF users only be a useful feature? Q, published by Richard Ngo on the AI Alignment Forum. Right now there are quite a few private safety docs floating around. There's evidently demand for a privacy setting lower than "only people I personally approve", but higher than "anyone on the internet gets to see it". But this means that safety researchers might not see relevant arguments and information. And as the field grows, passing on access to such documents on a personal basis will become even less efficient. My guess is that in most cases, the authors of these documents don't have a problem with other safety researchers seeing them, as long as everyone agrees not to distribute them more widely. One solution could be to have a checkbox for new posts which makes them only visible to verified Alignment Forum users. Would people use this? Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org.

    Explicit: No

Similar Podcasts

    The Nonlinear Library

    Release Date: 10/07/2021

    Authors: The Nonlinear Fund

    Description: The Nonlinear Library allows you to easily listen to top EA and rationalist content on your podcast player. We use text-to-speech software to create an automatically updating repository of audio content from the EA Forum, Alignment Forum, LessWrong, and other EA blogs. To find out more, please visit us at nonlinear.org

    Explicit: No

    The Nonlinear Library: Alignment Section

    Release Date: 02/10/2022

    Authors: The Nonlinear Fund

    Description: The Nonlinear Library allows you to easily listen to top EA and rationalist content on your podcast player. We use text-to-speech software to create an automatically updating repository of audio content from the EA Forum, Alignment Forum, LessWrong, and other EA blogs. To find out more, please visit us at nonlinear.org

    Explicit: No

    The Nonlinear Library: LessWrong

    Release Date: 03/03/2022

    Authors: The Nonlinear Fund

    Description: The Nonlinear Library allows you to easily listen to top EA and rationalist content on your podcast player. We use text-to-speech software to create an automatically updating repository of audio content from the EA Forum, Alignment Forum, LessWrong, and other EA blogs. To find out more, please visit us at nonlinear.org

    Explicit: No

    The Nonlinear Library: LessWrong Daily

    Release Date: 05/02/2022

    Authors: The Nonlinear Fund

    Description: The Nonlinear Library allows you to easily listen to top EA and rationalist content on your podcast player. We use text-to-speech software to create an automatically updating repository of audio content from the EA Forum, Alignment Forum, LessWrong, and other EA blogs. To find out more, please visit us at nonlinear.org

    Explicit: No

    The Nonlinear Library: EA Forum Daily

    Release Date: 05/02/2022

    Authors: The Nonlinear Fund

    Description: The Nonlinear Library allows you to easily listen to top EA and rationalist content on your podcast player. We use text-to-speech software to create an automatically updating repository of audio content from the EA Forum, Alignment Forum, LessWrong, and other EA blogs. To find out more, please visit us at nonlinear.org

    Explicit: No

    The Nonlinear Library: EA Forum Weekly

    Release Date: 05/02/2022

    Authors: The Nonlinear Fund

    Description: The Nonlinear Library allows you to easily listen to top EA and rationalist content on your podcast player. We use text-to-speech software to create an automatically updating repository of audio content from the EA Forum, Alignment Forum, LessWrong, and other EA blogs. To find out more, please visit us at nonlinear.org

    Explicit: No

    The Nonlinear Library: Alignment Forum Daily

    Release Date: 05/02/2022

    Authors: The Nonlinear Fund

    Description: The Nonlinear Library allows you to easily listen to top EA and rationalist content on your podcast player. We use text-to-speech software to create an automatically updating repository of audio content from the EA Forum, Alignment Forum, LessWrong, and other EA blogs. To find out more, please visit us at nonlinear.org

    Explicit: No

    The Nonlinear Library: LessWrong Weekly

    Release Date: 05/02/2022

    Authors: The Nonlinear Fund

    Description: The Nonlinear Library allows you to easily listen to top EA and rationalist content on your podcast player. We use text-to-speech software to create an automatically updating repository of audio content from the EA Forum, Alignment Forum, LessWrong, and other EA blogs. To find out more, please visit us at nonlinear.org

    Explicit: No

    The Nonlinear Library: Alignment Forum Top Posts

    Release Date: 02/10/2022

    Authors: The Nonlinear Fund

    Description: Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio.

    Explicit: No

    The Nonlinear Library: LessWrong Top Posts

    Release Date: 02/15/2022

    Authors: The Nonlinear Fund

    Description: Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio.

    Explicit: No

    sasodgy

    Release Date: 04/14/2021

    Description: Audio Recordings from the Students Against Sexual Orientation Discrimination (SASOD) Public Forum with Members of Parliament at the National Library in Georgetown, Guyana

    Explicit: No