EPISODE · May 27, 2026 · 9 MIN
How Netflix Built Chaos Engineering for APIs
from The API Podcast with Fexingo: REST, GraphQL, and Modern Web APIs · host Fexingo
Episode 14 of The API Podcast digs into chaos engineering for APIs: the practice of deliberately breaking your own endpoints to find weaknesses before they fail in production. Lucas and Luna walk through how Netflix developed the Simian Army, starting with Chaos Monkey in 2011, and why failure injection is now a standard resilience technique for large-scale API systems. They break down the concrete tools: Chaos Monkey for random instance termination, Chaos Kong for simulating a full AWS region outage, and the post-incident patterns that help a team actually learn from the chaos. They also discuss how smaller teams can apply chaos engineering without Netflix-scale infrastructure, using tools such as the open-source Litmus and Gremlin's free tier. No dry theory, just a real engineering story and a practical framework for making your API fault-tolerant. Keep every episode free: buymeacoffee.com/fexingo