Grokking, Generalization Collapse, and the Dynamics of Training Deep Neural Networks with Charles Martin - #734

from The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence) · host Sam Charrington

Today, we're joined by Charles Martin, founder of Calculation Consulting, to discuss Weight Watcher, an open-source tool for analyzing and improving Deep Neural Networks (DNNs) based on principles from theoretical physics. We explore the foundations of the Heavy-Tailed Self-Regularization (HTSR) theory that underpins it, which combines random matrix theory and renormalization group ideas to uncover deep insights about model training dynamics. Charles walks us through WeightWatcher’s ability to detect three distinct learning phases—underfitting, grokking, and generalization collapse—and how its signature “layer quality” metric reveals whether individual layers are underfit, overfit, or optimally tuned. Additionally, we dig into the complexities involved in fine-tuning models, the surprising correlation between model optimality and hallucination, the often-underestimated challenges of search relevance, and their implications for RAG. Finally, Charles shares his insights into real-world applications of generative AI and his lessons learned from working in the field. The complete show notes for this episode can be found at https://twimlai.com/go/734.

What this episode covers

NOW PLAYING

0:00 1:25:21

1×

No transcript for this episode yet

We transcribe on demand. Request one and we'll notify you when it's ready — usually under 10 minutes.

Share this episode

Similar Episodes

I'm ok

Mar 26, 2026 ·1m

Food Saved My Life

Mar 19, 2026 ·34m

Eat More Vegetables: The 4 Foods That Beat Ozempic (Naturally)

Feb 18, 2026 ·11m

How to End Heart Disease with Dr. Fuhrman

Feb 11, 2026 ·45m

Revolutionizing Breast Health: QT Imaging, Overdiagnosis, and What to Do Instead

Jan 27, 2026 ·35m

REMIX: Why we over-shop and compulsively acquire, and how to stop, with Dr Jan Eppingstall

Jan 9, 2026 ·61m

Similar Podcasts

MG Show MG Show The MG Show, hosted by Jeffrey Pedersen and Shannon Townsend, is a leading alternative media platform dedicated to uncovering the truth behind today’s most pressing political issues. Launched in 2019, the show has grown exponentially, offering unfiltered insights, comprehensive research, and real-time analysis. With a commitment to independent journalism and factual integrity, the MG Show empowers its audience with knowledge and encourages active participation in the political discourse. Ask A Spaceman Archives - 365 Days of Astronomy Ask A Spaceman Archives - 365 Days of Astronomy Podcasting Astronomy Every Day of the Year Breaking News Show | eTurboNews Juergen Thomas Steinmetz News is relevant to the global travel and tourism industry, human rights and global issues.Breaking news when it happens and only from the source. Eat to Live Jenna Fuhrman, Dr. Fuhrman Our health is our most precious gift and smart nutrition can change your life. Each month, join Dr. Fuhrman and his daughter, Jenna Fuhrman as they discuss important topics in the world of nutrition. Eat to Live will change the way you eat and think about food.

Frequently Asked Questions

How long is this episode of The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)?

This episode is 1 hour and 25 minutes long.

When was this The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence) episode published?

This episode was published on June 5, 2025.

What is this episode about?

Can I download this The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence) episode?

Yes, you can download this episode by clicking the download button on the episode player, or subscribe to the podcast in your preferred podcast app for automatic downloads.

URL copied to clipboard!