EPISODE · Mar 14, 2025 · 1H 6M
GenAI Traffic: Why API Infrastructure Must Evolve... Again // Erica Hughberg // #296
from MLOps.community · host Demetrios
GenAI Traffic: Why API Infrastructure Must Evolve... Again // MLOps Podcast #296 with Erica Hughberg, Community Advocate at Tetrate.

Join the Community: https://go.mlops.community/YTJoinIn
Get the newsletter: https://go.mlops.community/YTNewsletter

// Abstract
The way we handle API traffic is broken for GenAI. We've spent years optimizing for microservices: fast, stateless, and lightweight API calls. But GenAI changes everything. Requests are slower, heavier, and more complex, requiring long-lived connections, massive payloads, and streaming responses. Suddenly, traditional API gateways are struggling: timeout limits are too short, rate limiting models don’t fit, and payload constraints are blocking innovation.

In this episode, we unpack the new challenges of GenAI traffic and why infrastructure must evolve, again. We look back at previous API shifts, from the C10K problem to the monolith-to-microservices revolution, and how they reshaped networking. Now, AI-driven workloads demand a new kind of API gateway, one that handles token-based rate limiting, cost-aware request shaping, and scalable AI inference traffic.

// Bio
Erica Hughberg is a technical leader and community advocate passionate about helping engineering teams build scalable, secure, and human-centric application platforms. With a background in software engineering and a deep understanding of cloud-native technologies, she specializes in driving the adoption of open-source projects like Envoy Gateway, Istio, and Kubernetes Gateway API, which enable organizations to simplify traffic management, security, and API distribution.

As a maintainer of Envoy AI Gateway, she plays a key role in shaping the future of API infrastructure. She focuses on features to ensure organizations can securely and efficiently integrate AI-powered services while simplifying traffic management, security, and API distribution.

In the Envoy community, she drives collaboration, mentorship, and contributions that advance the project and its adoption. Lastly, as a believer in the power of storytelling, Erica enjoys translating complex technical concepts into engaging, accessible narratives in the form of social media posts, conference talks, podcasts, and educational content.

// Related Links
Efficient Deployment of Models at the Edge // Krishna Sridhar // MLOps Podcast #284 - https://youtu.be/sFqm7GTeulg

~~~~~~~~ ✌️Connect With Us ✌️ ~~~~~~~
Catch all episodes, blogs, newsletters, and more: https://go.mlops.community/TYExplore
Join our Slack community [https://go.mlops.community/slack]
Follow us on X/Twitter [@mlopscommunity](https://x.com/mlopscommunity) or [LinkedIn](https://go.mlops.community/linkedin)
Sign up for the next meetup: [https://go.mlops.community/register]
MLOps Swag/Merch: [https://shop.mlops.community/]
Connect with Demetrios on LinkedIn: /dpbrinkm
Connect with Erica on LinkedIn: /ericahughberg

Timestamps:
[00:00] Erica's preferred coffee
[00:30] Takeaways
[01:50] Evolving Web Gateways
[14:35] Microservices to LLM Shift
[17:42] Intelligence Privacy Model
[22:26] Infrastructure for AI Creativity
[25:25] AI Gateway Networking Challenges
[30:37] Streamlit MVP to Production
[43:03] AI Model Scaling Challenges
[47:48] Tech Advocacy and Skills
[53:17] Optimizing Edge AI Performance
[56:43] Product Management Insights
[1:00:02] Navigating Evolving Tech Challenges
[1:04:35] Wrap up