EPISODE · May 27, 2026 · 9 MIN

How Netflix Built Chaos Engineering for APIs

from The API Podcast with Fexingo: REST, GraphQL, and Modern Web APIs · host Fexingo

Episode 14 of The API Podcast digs into chaos engineering for APIs: the practice of deliberately breaking your own endpoints to find weaknesses before they fail in production. Lucas and Luna walk through how Netflix developed the Simian Army, starting with Chaos Monkey in 2011, and why failure injection is now a standard resilience technique for large-scale API systems. They break down the concrete tools, Chaos Monkey for random instance termination and Chaos Kong for simulating a full AWS region outage, along with the post-incident patterns that help a team actually learn from the chaos. They also discuss how smaller teams can apply chaos engineering without Netflix-scale infrastructure, using tools such as the open-source Litmus and Gremlin's free tier. No dry theory, just a real engineering story and a practical framework for making your API fault-tolerant.

Keep every episode free: buymeacoffee.com/fexingo

Frequently Asked Questions

How long is this episode of The API Podcast with Fexingo: REST, GraphQL, and Modern Web APIs?

This episode is 9 minutes long.

When was this episode of The API Podcast with Fexingo: REST, GraphQL, and Modern Web APIs published?

This episode was published on May 27, 2026.

Can I download this episode of The API Podcast with Fexingo: REST, GraphQL, and Modern Web APIs?

Yes, you can download this episode by clicking the download button on the episode player, or subscribe to the podcast in your preferred podcast app for automatic downloads.