EPISODE · May 20, 2026 · 2 MIN
“theory uplift differentially benefits safety & is massively underpriced” by Yudhister Kumar
[1] We will likely have near-superhuman mathematics AI by Q1 2027. [1]
[2] Qualitatively, AI mathematics capabilities are developing significantly faster than automated AI R&D capabilities. [2]
[3] Thus, we will likely have a period of time where the rate of our ability to rigorously & usefully verify and understand model behavior and model outputs outpaces the rate of capability development itself.
[4] Our ability to take advantage of this period is bottlenecked on the quality of our specification generation infrastructure, elicitation tooling (for proofs & specs etc.), and the institutional capacity for scaling useful outputs with capital.
[5] My understanding is that basically no one [3] is working on building infra that can usefully turn >100 million dollars of compute credits into safety-relevant mathematical output.
[5.1] The number of theory-driven ASI alignment efforts is also comparatively minuscule. ARC is a much better bet now than it was in 2023.
[5.2] My understanding is also that no one is working on developing AI-powered conceptual tooling infrastructure for tackling problems in, for instance, [metaphilosophy](https://www.alignmentforum.org/posts/EByDsY9S3EDhhfFzC/some-thoughts-on-metaphilosophy). This is a much harder problem.
[6] In worlds where alignment is easy, prosaic methods may [...]

The original text contained 3 footnotes which were omitted from this narration.
---
First published: May 20th, 2026
Source: https://www.lesswrong.com/posts/KWeAYcDJwfrG7RwBN/theory-uplift-differentially-benefits-safety-and-is
---
Narrated by TYPE III AUDIO.