Anton Teaches Packy AI | Ep 2 | Chinchilla episode artwork

EPISODE · Nov 25, 2022 · 1H 2M

Anton Teaches Packy AI | Ep 2 | Chinchilla

from "Age of Miracles" · host Packy McCormick | Turpentine

We're back! In Episode 2, Anton Teaches Packy about Deepmind's March 2022 paper, Training Compute-Optimal Large Language Models, or as it's more commonly known, Chinchilla.  Prior to Chinchilla, the best way to improve the performance of LLMs was thought to be by scaling up the size of the model. As a result, the largest models now have over 500 billion parameters. But there are only so many GPUs in the world, and throwing compute at the problem is expensive and energy intensive. In this paper, Deepmind found that the optimal way to scale an LLM is actually by scaling size (parameters) and training (data) proportionally.  Given the race for size, today's models are plenty big but need a lot more data.    In this conversation, we go deep on the paper itself, but we also zoom out to talk about the politics of AI, when AGI is going to hit, where to get more data, and why AI won't take our jobs. This one gets a lot more philosophical than our first episode as we explore the implications of Chinchilla and LLMs more generally.  If you enjoyed this conversation, subscribe for more. We're going to try to release one episode per week, and we want to make this the best way to get a deeper understanding of the mind-blowing progress happening in AI and what it means for everything we do as humans.   LINKS: Training Compute-Optimal Large Language Models: https://arxiv.org/abs/2203.15556  chinchilla's wild implications: https://www.lesswrong.com/posts/6Fpvc...  Scaling Laws for Neural Language Models (Kaplan et al): https://arxiv.org/abs/2001.08361 --- Send in a voice message: https://podcasters.spotify.com/pod/show/ageofmiracles/message

We're back! In Episode 2, Anton Teaches Packy about Deepmind's March 2022 paper, Training Compute-Optimal Large Language Models, or as it's more commonly known, Chinchilla.  Prior to Chinchilla, the best way to improve the performance of LLMs was thought to be by scaling up the size of the model. As a result, the largest models now have over 500 billion parameters. But there are only so many GPUs in the world, and throwing compute at the problem is expensive and energy intensive. In this paper, Deepmind found that the optimal way to scale an LLM is actually by scaling size (parameters) and training (data) proportionally.  Given the race for size, today's models are plenty big but need a lot more data.    In this conversation, we go deep on the paper itself, but we also zoom out to talk about the politics of AI, when AGI is going to hit, where to get more data, and why AI won't take our jobs. This one gets a lot more philosophical than our first episode as we explore the implications of Chinchilla and LLMs more generally.  If you enjoyed this conversation, subscribe for more. We're going to try to release one episode per week, and we want to make this the best way to get a deeper understanding of the mind-blowing progress happening in AI and what it means for everything we do as humans.   LINKS: Training Compute-Optimal Large Language Models: https://arxiv.org/abs/2203.15556  chinchilla's wild implications: https://www.lesswrong.com/posts/6Fpvc...  Scaling Laws for Neural Language Models (Kaplan et al): https://arxiv.org/abs/2001.08361 --- Send in a voice message: https://podcasters.spotify.com/pod/show/ageofmiracles/message

NOW PLAYING

Anton Teaches Packy AI | Ep 2 | Chinchilla

0:00 1:02:45

No transcript for this episode yet

We transcribe on demand. Request one and we'll notify you when it's ready — usually under 10 minutes.

Big Old Life: Heather Blackbird interviews people on planet earth. Heather Blackbird loves asking questions. This podcast is a learning experience. Join me, Heather Blackbird, as I talk to people about their lives. Frequency of new episodes is a little all over the place and I'm learning as I go. Big Old Life is a small way of talking about the vastness of life, one person at a time. If you are reading this or found this podcast it's probably because someone you know gave you a link to it. :) Explicit Tales Of A Superstar DJ The Insomniac Spun seemingly out of nowhere from her complacent life in the corporate world, turned seemingly overnight from 16-Hour shift work and into the life of a literally starving artist and working musician, The Protagonist navigates her supposed rise to fame and superstardom on a journey through spiritual awakening, coming-of-age, and intimate self-realization--guided by an omnipresent force and equipped with the power of love, magic, and music. {Enter The Multiverse.} [The Festival Project] The Festival Project, Inc.™ is a multidimensional multimedia platform which encompasses exploratory and artistic social personifications and expressions on cosmic theory, spirituality, growth, health & wellness, philosophy and theoretic dynamics in entertainment such as music, design, film, television, radio, dance and festival culture, art, fashion, literature, and science. The Festival Project™ and its subsidiary Non-Profit, The Collective Complex © aims to challenge modern artistic and philosop Explicit Bitcoin Is Dead Trey Carson Welcome to Bitcoin is Dead, the ultimate Bitcoin variety show where host Trey takes you on a journey through the ever-evolving world of Bitcoin. Each episode brings new personalities, fascinating locations, and insightful conversations with politicians, educators, and innovators shaping the future of Bitcoin. Whether you're a seasoned Bitcoiner or just starting your journey, tune in for thought-provoking discussions, unique perspectives, and a deep dive into the ideas and people driving the Bitcoin revolution. Explicit The Sacred +Profane Podcast nephtaragrace The Sacred + Profane Podcast is a provocative conversation dedicated to cementing a better future for all. We specialize in unpacking the nuances of what is considered sacred and profane, particularly focusing on sex, death, and all that pertains to the circle of life. Our aim in focusing on such ”taboo” subject matter is to demystify what is unconscious, bring to light what has been known for centuries as ”the occult,” and empower the rapid transformation that is occurring on the Planet. Explicit

Frequently Asked Questions

How long is this episode of "Age of Miracles"?

This episode is 1 hour and 2 minutes long.

When was this "Age of Miracles" episode published?

This episode was published on November 25, 2022.

What is this episode about?

We're back! In Episode 2, Anton Teaches Packy about Deepmind's March 2022 paper, Training Compute-Optimal Large Language Models, or as it's more commonly known, Chinchilla.  Prior to Chinchilla, the best way to improve the performance of LLMs was...

Can I download this "Age of Miracles" episode?

Yes, you can download this episode by clicking the download button on the episode player, or subscribe to the podcast in your preferred podcast app for automatic downloads.
URL copied to clipboard!