PODCAST · technology

Humans of Reliability

Behind every reliable software system, there are people working hard to keep it online. Humans of Reliability is a series that spotlights the engineers, leaders, and innovators at the heart of incident management and system reliability. Through candid conversations, we explore the challenges, lessons, and personal journeys of those navigating complex technical landscapes to ensure the systems we rely on run smoothly. From unforgettable incident stories to favorite tools, workflows, and hobbies, Humans of Reliability uncovers the human side of technology, offering insights and inspiration for anyone passionate about building and maintaining resilient systems. https://rootly.com/humans-of-reliability

  1. 32

    The Golden Hour: Why the First 15 Minutes of an Incident Decide Everything w/ Gandhi M. N. Kumar (Twilio)

    Most incident response advice focuses on tools, alerts, and post-mortems. Gandhi Mathi Nathan Kumar, Principal Incident Commander at Twilio, with 14 years running calls that have pulled in up to 100 responders, argues the work that actually matters happens in the first 15 minutes. In this episode, Gandhi walks through what he calls the golden hour: the window where you decide whether you know what's broken, who belongs on the call, and whether to chase the root cause or reach for redundancy. He gets into why mitigation has to come before diagnosis, why customers trust your status page more than your engineers, and why he once sat with a stopwatch counting how many clicks it took to declare an incident. Along the way: the human side leaders keep underinvesting in, the math of on-call fatigue, and where AI is actually pulling weight in the incident commander seat.

HOSTED BY

Rootly

Frequently Asked Questions

How many episodes does Humans of Reliability have?

Humans of Reliability currently has 1 episode available on PodParley. New episodes are automatically indexed when they're published to the podcast feed.

What is Humans of Reliability about?

Behind every reliable software system, there are people working hard to keep it online. Humans of Reliability is a series that spotlights the engineers, leaders, and innovators at the heart of incident management and system reliability. Through candid conversations, we explore the challenges,...

How often does Humans of Reliability release new episodes?

Humans of Reliability has 1 episode. Check the episode list to see recent publication dates and frequency.

Where can I listen to Humans of Reliability?

You can listen to Humans of Reliability on PodParley by clicking any episode. We provide an embedded audio player for direct listening, and you can also subscribe via your preferred podcast app using the RSS feed.

Who hosts Humans of Reliability?

Humans of Reliability is created and hosted by Rootly.