EPISODE · Jun 19, 2026 · 8 MIN

How SRE Teams Use Postmortem Action Items to Prevent Recurrence

from The Site Reliability Podcast with Fexingo: SRE, Uptime, and Production Engineering · host Fexingo

In Episode 60, Lucas and Luna dive into the most overlooked part of incident response: the postmortem action items that actually prevent the same outage from happening twice. They unpack a 2025 study from Google's SRE team that found 67% of postmortem action items are never completed, and explore why. Using concrete examples from a major AWS S3 outage and a Stripe payment-processing incident, they discuss common failure modes like vague ownership, lack of prioritization, and action items that don't address root causes. The hosts also share practical tactics: assigning a single DRI per action, using 'blameless' language to increase completion rates, and tying action items directly to error budget burn. A must-listen for any engineer who has ever written a postmortem only to see the same incident happen again.

Keep every episode free: buymeacoffee.com/fexingo

Frequently Asked Questions

How long is this episode of The Site Reliability Podcast with Fexingo: SRE, Uptime, and Production Engineering?

This episode is 8 minutes long.

When was this episode of The Site Reliability Podcast with Fexingo: SRE, Uptime, and Production Engineering published?

This episode was published on June 19, 2026.

Can I download this episode of The Site Reliability Podcast with Fexingo: SRE, Uptime, and Production Engineering?

Yes, you can download this episode by clicking the download button on the episode player, or subscribe to the podcast in your preferred podcast app for automatic downloads.