EPISODE · May 7, 2026 · 9 MIN
“Mechanistic estimation for wide random MLPs” by Jacob_Hilton
This post covers joint work with Wilson Wu, George Robinson, Mike Winer, Victor Lecomte and Paul Christiano. Thanks to Geoffrey Irving and Jess Riedel for comments on the post. In ARC's latest paper, we study the following problem: given a randomly initialized multilayer perceptron (MLP), produce an estimate for the expected output of the model under Gaussian input. The usual approach to this problem is to sample many possible inputs, run them all through the model, and take the average. Instead, we produce an estimate "mechanistically", without running the model even once. For wide models, our approach produces more accurate estimates, both in theory and in practice. Paper: Estimating the expected output of wide random MLPs more efficiently than sampling Code: mlp_cumulant_propagation GitHub repo We are excited about this result as an early step towards our goal of producing mechanistic estimates that outperform random sampling for any trained neural network. Drawing an analogy between this goal and a proof by induction, we see this result as (part of) the "base case": handling networks at initialization. We have a vision for the "inductive step", although we expect that to be much more difficult. Summary of results [...] ---Outline:(01:29) Summary of results(04:39) Significance of results(07:18) Extending to trained networks(08:36) Conclusion The original text contained 18 footnotes which were omitted from this narration. --- First published: May 7th, 2026 Source: https://www.lesswrong.com/posts/fsG4m6sRMpomd7Rk6/mechanistic-estimation-for-wide-random-mlps --- Narrated by TYPE III AUDIO. ---Images from the article:Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.
NOW PLAYING
“Mechanistic estimation for wide random MLPs” by Jacob_Hilton
No transcript for this episode yet
Similar Episodes
Dec 20, 2021 ·0m