EPISODE · Jun 9, 2026 · 8 MIN
How SRE Teams Use Canary Deployments to Reduce Release Risk
from The Site Reliability Podcast with Fexingo: SRE, Uptime, and Production Engineering · host Fexingo
Lucas and Luna dive into canary deployments: the practice of routing a small percentage of production traffic to a new version before rolling it out broadly. Lucas explains why Netflix's 'canary clusters' and Etsy's 'feature flipping' approach changed how SRE teams think about release risk, and contrasts them with the old all-at-once deploys that caused major incidents. They discuss specific strategies: comparing metrics between canary and baseline, automatic rollback triggers, and the trade-off between speed and safety. Luna brings up the 2023 incident where a mismatched canary size led to a slow-burn outage, and they explore how teams decide on canary percentage and duration. A concrete episode for any engineer or manager responsible for production releases.
Keep every episode free: buymeacoffee.com/fexingo
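For readers who want a concrete picture of the "metrics comparison" and "automatic rollback trigger" ideas mentioned above, here is a minimal sketch in Python. It is not taken from the episode: the `WindowStats` shape, the `should_roll_back` name, and every threshold (`max_ratio`, `min_requests`, `noise_floor`) are hypothetical values chosen for illustration, and real canary analysis tools typically compare many metrics with statistical tests rather than a single ratio.

```python
from dataclasses import dataclass


@dataclass
class WindowStats:
    """Request and error counts for one comparison window (hypothetical shape)."""
    requests: int
    errors: int

    @property
    def error_rate(self) -> float:
        # Treat an empty window as a zero error rate to avoid dividing by zero.
        return self.errors / self.requests if self.requests else 0.0


def should_roll_back(
    canary: WindowStats,
    baseline: WindowStats,
    max_ratio: float = 2.0,      # assumed: canary may err at most 2x the baseline
    min_requests: int = 500,     # assumed: below this, the canary has too little data
    noise_floor: float = 0.001,  # assumed: ignore error rates under 0.1%
) -> bool:
    """Return True when the canary's error rate is clearly worse than baseline."""
    if canary.requests < min_requests:
        # A canary that is too small cannot give a trustworthy signal yet.
        return False
    allowed = max(baseline.error_rate * max_ratio, noise_floor)
    return canary.error_rate > allowed


if __name__ == "__main__":
    baseline = WindowStats(requests=20_000, errors=200)  # 1% errors
    canary = WindowStats(requests=1_000, errors=30)      # 3% errors
    print(should_roll_back(canary, baseline))
```

The `min_requests` guard is one simple way to express the canary-size concern: if the canary receives too little traffic, the comparison is skipped rather than trusted, which trades detection speed for fewer false alarms.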