EPISODE · Jun 17, 2026 · 2 MIN
[Linkpost] “Scaling Hypothesis #2: Are Humans Just More Over-Parameterized?” by gwern
This is a link post.

There are many mysteries about deep learning and human intelligence, but we could describe the biggest anomaly this way: why are artificial neural nets smart in such stupid ways, and biological brains stupid but in smart ways?

I propose a major change in deep learning scaling paradigms: the architectural differences between human brains and NNs (particularly LLMs) may be due to a bias-variance tradeoff, where LLMs minimize variance and human brains minimize bias. Human brains do this by deep double descent-style overparameterization, and adopting a scaling strategy of extremely high-learning-rate training of extremely overparameterized models on small diverse highly-filtered datasets.

This approach would lead to sample-efficiently and compute-efficiently traveling (or catapulting) to a highly-generalizing human-like basin in the model loss landscape, while performing poorly up until the end and failing to memorize much data. If true, this would explain a number of odd stylized facts about how humans/NNs perform well/poorly.

Such a 'catapulted LLM' would generalize much better than existing NNs, be immune to adversarial attacks, have better economics and be more resistant to cloning, could potentially enable extremely efficient MLP architectures, and by giving true generalization, provide a sturdy foundation for [...]

---

First published: June 17th, 2026
Source: https://www.lesswrong.com/posts/Eg7caxofhxZGnhgBD/scaling-hypothesis-2-are-humans-just-more-over-parameterized
Linkpost URL: https://gwern.net/llm-catapult

---

Narrated by TYPE III AUDIO.