EPISODE · Jun 5, 2026 · 12 MIN
“Learnings from starting an AI safety research team” by draganover, Erin Robertson
This post's goal is to distill our takeaways from building a research team (somewhat) from scratch over the past four months. We describe some context about our team and how it came about, and then provide some lessons learned. Since AI safety is becoming more and more entrepreneurial, we hope this is helpful for others trying to do the same.

1. The team

We're a new alignment research team within Arcadia Impact, based in London. We're a team of 8, working closely with members of the UK AISI alignment team. We currently have three main projects:

- Understanding model motivations. This currently looks like:
  - Trying to generate documents which fully describe a model's behaviour (given just its behaviour).
  - Producing an open analysis of alignment training techniques and ways this training could go wrong.
- Doing scalable oversight for alignment. This includes validating debate protocols in practice and then trying to apply them to fuzzy alignment-relevant tasks.
- Building pipelines for doing automated alignment research.

We're also hiring for two roles! More on this at the bottom.

2. Context about how the team came about

The rest of this post is written from the perspective of Andrew Draganov (research lead & current [...]

---

Outline:
(00:33) 1. The team
(01:29) 2. Context about how the team came about
(04:13) 3. Lessons learned
(04:25) 3.1. Hiring
(06:36) 3.2. Networking
(09:13) 3.3. Trying to build a good team culture
(11:17) Interested in working with us?

The original text contained 1 footnote which was omitted from this narration.

---

First published: June 5th, 2026
Source: https://www.lesswrong.com/posts/4onALBNDff2LFPyNZ/learnings-from-starting-an-ai-safety-research-team

---

Narrated by TYPE III AUDIO.