EPISODE · May 30, 2026 · 10 MIN
How Kubernetes Pod Autoscaling Fails Under Traffic Spikes
from DevOps Daily with Fexingo: CI/CD, Kubernetes, and Modern Software Operations · host Fexingo
In this episode, Lucas and Luna dig into the mechanics of the Kubernetes Horizontal Pod Autoscaler (HPA), and specifically why it often fails to keep up with sudden traffic spikes. They walk through a real-world scenario from a retail platform whose request latency jumped from under 100 ms to over 2 seconds during a flash sale. The root cause wasn't resource limits or cluster size; it was the default HPA scaling metrics and cooldown windows. Lucas explains how target CPU utilization, stabilization windows, and custom metrics interact, and why relying solely on CPU-based HPA leaves you vulnerable. They also discuss an alternative: Kubernetes Event-driven Autoscaling (KEDA) with request-based metrics. If you're running Kubernetes in production and haven't stress-tested your HPA configuration, this episode could save you from a late-night incident.
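For listeners who want to see the arithmetic behind the CPU-based scaling discussed here, the following is a minimal sketch of the replica calculation described in the Kubernetes HPA documentation: desired replicas equal the current replica count times the ratio of the observed metric to the target, rounded up, with no change when the ratio is within a tolerance (0.1 by default). The function name and the sample numbers are illustrative assumptions, not figures from the episode, and the sketch deliberately ignores stabilization windows, scaling policies, unready pods, and missing metrics.

```python
import math


def desired_replicas(current_replicas: int,
                     current_metric: float,
                     target_metric: float,
                     tolerance: float = 0.1) -> int:
    """Simplified HPA replica calculation (hypothetical helper).

    Applies ceil(current_replicas * current_metric / target_metric),
    and keeps the current count when the ratio is within the tolerance.
    """
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: the HPA skips scaling.
        return current_replicas
    return math.ceil(current_replicas * ratio)


# Illustrative numbers only: 4 pods averaging 90% CPU against a 50% target.
print(desired_replicas(4, 90, 50))  # 8
# 52% against a 50% target is within the 10% tolerance, so nothing changes.
print(desired_replicas(4, 52, 50))  # 4
```

Because the result depends on the ratio of observed to target utilization, a single evaluation can only grow the deployment by that ratio. When CPU readings lag behind the actual request rate, several evaluation cycles may pass before capacity catches up, which is one reason request-based metrics can react faster.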