PodParley

Will inference move to the edge?

Shifting AI inference from hyperscale data centers to smaller edge data centers – and even consumer devices – could have big implications for energy.

An episode of the Catalyst with Shayle Kann podcast, hosted by Latitude Media, titled "Will inference move to the edge?" was published on December 18, 2025 and runs 47 minutes.

December 18, 2025 · 47m · Catalyst with Shayle Kann


Today virtually all AI compute takes place in centralized data centers, driving the demand for massive power infrastructure. But as workloads shift from training to inference, and AI applications become more latency-sensitive (autonomous vehicles, anyone?), there’s another pathway: migrating a portion of inference from centralized computing to the edge. Instead of a gigawatt-scale data center in a remote location, we might see a fleet of smaller data centers clustered around an urban core. Some inference might even shift to our devices.

So how likely is a shift like this, and what would need to happen for it to substantially reshape AI power?

In this episode, Shayle talks to Dr. Ben Lee, a professor of electrical engineering and computer science at the University of Pennsylvania, as well as a visiting researcher at Google. Shayle and Ben cover topics like:

- The three main categories of compute: hyperscale, edge, and on-device
- Why training is unlikely to move from hyperscale
- The low latency demands of new applications like autonomous vehicles
- How generative AI is training us to tolerate longer latencies
- Why distributed inference doesn’t face the same technical challenges as distributed training
- Why consumer devices may limit model capability

Resources:

- ACM SIGMETRICS Performance Evaluation Review: A Case Study of Environmental Footprints for Generative AI Inference: Cloud versus Edge
- Internet of Things and Cyber-Physical Systems: Edge AI: A survey

Credits: Hosted by Shayle Kann. Produced and edited by Daniel Woldorff. Original music and engineering by Sean Marquand. Stephen Lacey is our executive editor.

Catalyst is brought to you by EnergyHub. EnergyHub helps utilities build next-generation virtual power plants that unlock reliable flexibility at every level of the grid. See how EnergyHub helps unlock the power of flexibility at scale, and deliver more value through cross-DER dispatch with their leading Edge DERMS platform, by visiting energyhub.com.

Catalyst is brought to you by Bloom Energy. AI data centers can’t wait years for grid power, and with Bloom Energy’s fuel cells, they don’t have to. Bloom Energy delivers affordable, always-on, ultra-reliable onsite power, built for chipmakers, hyperscalers, and data center leaders looking to power their operations at AI speed. Learn more by visiting BloomEnergy.com.

Catalyst is supported by Third Way. Third Way’s new PACE study surveyed over 200 clean energy professionals to pinpoint the non-cost barriers delaying clean energy deployment today and offers practical solutions to help get projects over the finish line. Read Third Way’s full report, and learn more about their PACE initiative, at www.thirdway.org/pace.
