EPISODE · Jun 9, 2026 · 19 MIN
“A Mike’s-Eye View of ARC’s Research” by Jacob_Hilton
Over the past 15 months or so, ARC's technical agenda has developed quite a bit. The advent of the Matching Sampling Principle (MSP), and ideas like it, has begotten a host of concrete technical problems; progress on those problems has given us more philosophical clarity on the big picture, which has led to even more technical progress. The two most recent public discussions of ARC's research (Jacob's "A Bird's Eye View of ARC's Research" and David's "Obstacles in ARC's research agenda") both came out before this flywheel really got spinning, and a lot of what we now consider central to the agenda isn't reflected in either of them.

The goal of this post is to give a clear, updated picture of what we're actually trying to do. This is written from my point of view; I don't speak for my whole organization.

Here is ARC's hoped-for pipeline for aligning a powerful AI: monitor training to detect structure as it is added to the model; convert that structure into advice that improves an MSP-style mechanistic estimator of the model's behavior; use the resulting estimator, together with a description of the relevant input distribution, to estimate a safety-relevant quantity such [...]

Outline:
(03:23) Matching Sampling Principle
(09:47) Identifying Structure / Plugging Structure into MSP
(13:21) Dealing with Real Data
(16:18) Aligned to What?
(18:04) Mechanistic Anomaly Detection

The original text contained 22 footnotes which were omitted from this narration.

First published: June 9th, 2026
Source: https://www.lesswrong.com/posts/M2tD23bvQLBqsEpqu/a-mike-s-eye-view-of-arc-s-research

Narrated by TYPE III AUDIO.