EPISODE · Jun 19, 2026 · 8 MIN
How SRE Teams Use Postmortem Action Items to Prevent Recurrence
from The Site Reliability Podcast with Fexingo: SRE, Uptime, and Production Engineering · host Fexingo
In Episode 60, Lucas and Luna dive into the most overlooked part of incident response: the postmortem action items that actually prevent the same outage from happening twice. They unpack a 2025 study from Google's SRE team that found 67% of postmortem action items are never completed, and explore why. Using concrete examples from a major AWS S3 outage and a Stripe payment-processing incident, they discuss common failure modes like vague ownership, lack of prioritization, and action items that don't address root causes. The hosts also share practical tactics: assigning a single DRI per action, using 'blameless' language to increase completion rates, and tying action items directly to error budget burn. A must-listen for any engineer who has ever written a postmortem only to see the same incident happen again.