EPISODE · Oct 6, 2023 · 39 MIN

LLM Inference Speed (Tech Deep Dive)

from Thinking Machines: AI & Philosophy · host Daniel Reid Cahn

In this tech talk, we dive deep into the technical specifics around LLM inference. The big question is: Why are LLMs slow? How can they be faster? And might slow inference affect UX in the next generation of AI-powered software?

We jump into:

Is fast model inference the real moat for LLM companies?

What are the implications of slow model inference on the future of decentralized and edge model inference?

As demand rises, what will the latency/throughput tradeoff look like?

What innovations on the horizon might massively speed up model inference?

Frequently Asked Questions

How long is this episode of Thinking Machines: AI & Philosophy?

This episode is 39 minutes long.

When was this Thinking Machines: AI & Philosophy episode published?

This episode was published on October 6, 2023.

What is this episode about?

In this tech talk, we dive deep into the technical specifics around LLM inference. The big question is: Why are LLMs slow? How can they be faster? And might slow inference affect UX in the next generation of AI-powered software? We jump into: Is fast...

Can I download this Thinking Machines: AI & Philosophy episode?

Yes, you can download this episode by clicking the download button on the episode player, or subscribe to the podcast in your preferred podcast app for automatic downloads.