PODCAST · technology
Reliability Rebels
by Amin Astaneh
The Reliability Rebels Podcast explores making software and systems more reliable by challenging the status quo. We sometimes have to challenge past decisions, existing technology, and even company culture when improving how we run production. This podcast will explore real-life examples from our guests and reveal insights and techniques applicable to your career and team. Intended audience: humans in the tech industry, especially software engineers and their leaders, product managers, and DevOps/Site Reliability Engineering practitioners.
-
12
Episode 12: Amin Astaneh
In this special episode, the tables are turned: host Amin Astaneh becomes the guest. Stephen Townshend, host of Slight Reliability, interviews Amin about rebuilding his life as a nomad: traveling North America, living out of a pickup truck, and running his business from the road. A personal conversation about freedom, simplicity, and the human side of a life in tech. Show Notes Available at https://podcast.certomodo.io/amin-astaneh.html.
-
11
Episode 11: Sylvain Kalache
AI agents are triaging incidents and writing runbooks, but are LLMs actually the right tool for operational work? Sylvain Kalache, Head of AI Labs at Rootly, shares research on where AI SRE tools add real value, where they fall apart, and what it means for operational maturity when humans only see the hardest problems. Guest: Sylvain Kalache, Head of AI Labs at Rootly (https://rootly.com). Show Notes Available at https://podcast.certomodo.io/sylvain-kalache.html.
-
10
Episode 10: Kyle Forster
Explores the 'AI code tsunami' and how massive, AI-generated code changes are forcing engineering teams to rethink traditional code reviews, observability, and the future of SRE roles. The conversation highlights a shift toward treating test environments like production and using narrowly scoped AI agents to manage system reliability, guided by simplified, binary SLIs and SLOs. Guest: Kyle Forster, founder and CEO of RunWhen (https://runwhen.com). Show Notes Available at https://podcast.certomodo.io/kyle-forster.html.
-
9
Episode 9: Jon Reeve
Discusses the 'complexity cult' of the current observability industry, how the open-source TUI tool Gonzo can reveal infrastructure insights through a novel use of LLMs for sentiment analysis, and the vision of more accessible observability experiences for software engineers. Guest: Jon Reeve, founder and CPO of ControlTheory (controltheory.com). Show Notes Available at https://podcast.certomodo.io/jon-reeve.html.
-
8
Episode 8: Aaron 'Checo' Pacheco
Explores the evolution of monitoring and observability with Aaron Pacheco from Ottermon.ai, examining how observability costs now consume 15-25% of infrastructure budgets. Show Notes Available at https://podcast.certomodo.io/aaron-pacheco.html.
-
7
Episode 7: Sebastian Vietz
Discusses how naming conventions shape industry perceptions, with a focus on AI SRE terminology. Guest: Sebastian Vietz from Compass Digital. Show Notes Available at https://podcast.certomodo.io/sebastian-vietz.html.
-
6
Episode 6: Chris Evans
Explores whether automation through AI actually reduces toil or just shifts it elsewhere with Chris Evans from Incident.io. Show Notes Available at https://podcast.certomodo.io/chris-evans.html.
-
5
Episode 5: Derek Brown
Compares infrastructure management at large tech companies versus smaller organizations with Derek Brown from Plaid. Show Notes Available at https://podcast.certomodo.io/derek-brown.html.
-
4
Episode 4: Kat Gaines
Examines incident management beyond technical fixes, emphasizing communication and customer experience with Kat Gaines from PagerDuty. Show Notes Available at https://podcast.certomodo.io/kat-gaines.html.
-
3
Episode 3: Michael Abed
Chronicles the resolution of a complex production incident at Meta that lasted over three days. Guest: Michael Abed from Datadog. Show Notes Available at https://podcast.certomodo.io/michael-abed.html.
-
2
Episode 2: Ricardo Amaro
Reflects on early DevOps initiatives at Acquia with Ricardo Amaro, who authored a chapter on ML-driven capacity planning in Seeking SRE. Show Notes Available at https://podcast.certomodo.io/ricardo-amaro.html.
-
1
Episode 1: Rick Gorman
Inaugural episode exploring technical debt, testing approaches, and blameless culture with software engineer Rick Gorman. Show Notes Available at https://podcast.certomodo.io/rick-gorman.html.
HOSTED BY
Amin Astaneh