
LLM Inference Speed (Tech Deep Dive)

An episode of the Thinking Machines: AI & Philosophy podcast, hosted by Daniel Reid Cahn, titled "LLM Inference Speed (Tech Deep Dive)" was published on October 6, 2023 and runs 39 minutes.

October 6, 2023 · 39m · Thinking Machines: AI & Philosophy



In this tech talk, we dive deep into the technical specifics around LLM inference.

The big questions are: Why are LLMs slow? How can they be faster? And might slow inference affect UX in the next generation of AI-powered software?


We jump into:

  • Is fast model inference the real moat for LLM companies?
  • What are the implications of slow model inference on the future of decentralized and edge model inference?
  • As demand rises, what will the latency/throughput tradeoff look like?
  • What innovations on the horizon might massively speed up model inference?