EPISODE · Jun 18, 2026 · 8 MIN

How Datadog Monitors Its Own Infrastructure

from The CTO Podcast with Fexingo: Technical Leadership, Architecture, and Engineering Org · host Fexingo

Episode 58 of The CTO Podcast goes inside Datadog's engineering org to explore how the company monitors its own 100-terabyte infrastructure. Lucas and Luna walk through Datadog's dogfooding culture, the architectural challenges of running a monitoring platform that has to monitor itself, and how the team handles alert fatigue, distributed tracing, and log ingestion at massive scale. They discuss specific tools such as the Datadog Agent, the trace-agent, and the custom time-series database built in-house. The episode includes concrete numbers: 30 trillion time-series points ingested daily, a 99.99 percent uptime target, and 8,000 hosts managed by the SRE team across multiple cloud providers. Tune in for a rare look at how the watcher watches itself.

#Datadog #InfrastructureMonitoring #Dogfooding #SRE #Observability #TimeSeriesDatabase #DistributedTracing #AlertFatigue #CloudInfrastructure #EngineeringCulture #SiteReliabilityEngineering #DevOps #BusinessAndTechnology #FexingoBusiness #BusinessPodcast #CTO #TechnicalLeadership #Architecture

Keep every episode free: buymeacoffee.com/fexingo

Frequently Asked Questions

How long is this episode of The CTO Podcast with Fexingo?

This episode is 8 minutes long.

When was this episode of The CTO Podcast with Fexingo published?

This episode was published on June 18, 2026.

What is this episode about?

Episode 58 of The CTO Podcast goes inside Datadog's engineering org to explore how the company monitors its own 100-terabyte infrastructure. Lucas and Luna walk through Datadog's dogfooding culture, the architectural challenges of running a...

Can I download this episode of The CTO Podcast with Fexingo?

Yes. Click the download button on the episode player, or subscribe to the podcast in your preferred podcast app for automatic downloads.