
EPISODE · Mar 14, 2025 · 1H 6M

GenAI Traffic: Why API Infrastructure Must Evolve... Again // Erica Hughberg // #296

from MLOps.community · host Demetrios

GenAI Traffic: Why API Infrastructure Must Evolve... Again // MLOps Podcast #296 with Erica Hughberg, Community Advocate at Tetrate.

Join the Community: https://go.mlops.community/YTJoinIn
Get the newsletter: https://go.mlops.community/YTNewsletter

// Abstract
The way we handle API traffic is broken for GenAI. We've spent years optimizing for microservices: fast, stateless, and lightweight API calls. But GenAI changes everything. Requests are slower, heavier, and more complex, requiring long-lived connections, massive payloads, and streaming responses. Suddenly, traditional API gateways are struggling: timeout limits are too short, rate limiting models don’t fit, and payload constraints are blocking innovation. In this episode, we unpack the new challenges of GenAI traffic and why infrastructure must evolve, again. We look back at previous API shifts, from the C10K problem to the monolith-to-microservices revolution, and how they reshaped networking. Now, AI-driven workloads demand a new kind of API gateway, one that handles token-based rate limiting, cost-aware request shaping, and scalable AI inference traffic.

// Bio
Erica Hughberg is a technical leader and community advocate passionate about helping engineering teams build scalable, secure, and human-centric application platforms. With a background in software engineering and a deep understanding of cloud-native technologies, she specializes in driving the adoption of open-source projects like Envoy Gateway, Istio, and Kubernetes Gateway API, which enable organizations to simplify traffic management, security, and API distribution. As a maintainer of Envoy AI Gateway, she plays a key role in shaping the future of API infrastructure. She focuses on features to ensure organizations can securely and efficiently integrate AI-powered services while simplifying traffic management, security, and API distribution.

In the Envoy community, she drives collaboration, mentorship, and contributions that advance the project and its adoption. Lastly, as a believer in the power of storytelling, Erica enjoys translating complex technical concepts into engaging, accessible narratives in the form of social media posts, conference talks, podcasts, and educational content.

// Related Links
Efficient Deployment of Models at the Edge // Krishna Sridhar // MLOps Podcast #284 - https://youtu.be/sFqm7GTeulg

~~~~~~~~ ✌️Connect With Us ✌️ ~~~~~~~
Catch all episodes, blogs, newsletters, and more: https://go.mlops.community/TYExplore
Join our Slack community [https://go.mlops.community/slack]
Follow us on X/Twitter [@mlopscommunity](https://x.com/mlopscommunity) or [LinkedIn](https://go.mlops.community/linkedin)
Sign up for the next meetup: [https://go.mlops.community/register]
MLOps Swag/Merch: [https://shop.mlops.community/]
Connect with Demetrios on LinkedIn: /dpbrinkm
Connect with Erica on LinkedIn: /ericahughberg

Timestamps:
[00:00] Erica's preferred coffee
[00:30] Takeaways
[01:50] Evolving Web Gateways
[14:35] Microservices to LLM Shift
[17:42] Intelligence Privacy Model
[22:26] Infrastructure for AI Creativity
[25:25] AI Gateway Networking Challenges
[30:37] Streamlit MVP to Production
[43:03] AI Model Scaling Challenges
[47:48] Tech Advocacy and Skills
[53:17] Optimizing Edge AI Performance
[56:43] Product Management Insights
[1:00:02] Navigating Evolving Tech Challenges
[1:04:35] Wrap up




Frequently Asked Questions

How long is this episode of MLOps.community?

This episode is 1 hour and 6 minutes long.

When was this MLOps.community episode published?

This episode was published on March 14, 2025.
