EPISODE · Jun 14, 2026 · 9 MIN
How SRE Teams Use Feature Flags to Reduce Deployment Risk
from The Site Reliability Podcast with Fexingo: SRE, Uptime, and Production Engineering · host Fexingo
In Episode 51 of The Site Reliability Podcast, Lucas and Luna explore how SRE teams use feature flags, not just for canary releases but as a core tool to decouple deployment from release, reduce blast radius, and roll back instantly without redeploying. They walk through a real incident at a major streaming company where a misconfigured flag caused a 47-minute partial outage, and how the team later rebuilt its flag lifecycle with expiration dates, audit trails, and mandatory approvals for 'kill switches'. Lucas explains the difference between boolean, multivariate, and permission-based flags, and why treating flags as 'technical debt' is critical for long-term reliability. The episode also touches on how feature flags intersect with observability: specifically, how teams instrument flag state changes so they can be correlated with metrics in dashboards. If you've ever wondered why your feature toggles keep piling up, this episode gives you a concrete process to clean them up.
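For readers who want something concrete, here is a minimal Python sketch of the lifecycle ideas the episode summary names: expiration dates, an audit trail, and an approval step for kill switches. It is not taken from the episode or from any particular flag product. The `Flag` and `FlagStore` classes, the field names, the rule that the approver must differ from the actor, and the example flag names are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Flag:
    name: str
    enabled: bool
    owner: str
    expires: date              # after this date the flag counts as debt to clean up
    kill_switch: bool = False  # assumed rule: kill switches need a second approver


@dataclass
class FlagStore:
    flags: dict = field(default_factory=dict)
    audit_log: list = field(default_factory=list)

    def register(self, flag: Flag) -> None:
        self.flags[flag.name] = flag

    def set_enabled(self, name, enabled, actor, approver=None):
        flag = self.flags[name]
        # Hypothetical approval policy: someone other than the actor must sign off.
        if flag.kill_switch and (approver is None or approver == actor):
            raise PermissionError(f"{name} is a kill switch; a second approver is required")
        # Record (flag, old state, new state, actor, approver) before applying the change.
        self.audit_log.append((name, flag.enabled, enabled, actor, approver))
        flag.enabled = enabled

    def expired(self, today):
        # Names of flags whose expiration date has passed, sorted for stable output.
        return sorted(n for n, f in self.flags.items() if f.expires < today)


store = FlagStore()
store.register(Flag("new_player_ui", False, "team-playback", date(2026, 7, 1)))
store.register(Flag("disable_recommendations", False, "team-sre",
                    date(2026, 12, 31), kill_switch=True))

store.set_enabled("new_player_ui", True, actor="alice")
store.set_enabled("disable_recommendations", True, actor="alice", approver="bob")

print(store.expired(date(2026, 8, 1)))  # ['new_player_ui']
```

The audit log entries are the kind of record a team could also forward to its dashboards as annotations, so that a flag flip can be lined up against a change in metrics. How that forwarding works depends on the observability stack and is left out here.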