EPISODE · Jun 18, 2026 · 8 MIN
How Datadog Monitors Its Own Infrastructure
from The CTO Podcast with Fexingo: Technical Leadership, Architecture, and Engineering Org · host Fexingo
Episode 58 of The CTO Podcast goes inside Datadog's engineering org to explore how the company monitors its own 100-terabyte infrastructure. Lucas and Luna walk through Datadog's dogfooding culture, the architectural challenges of running a monitoring platform that must monitor itself, and how the team handles alert fatigue, distributed tracing, and log ingestion at massive scale. They discuss specific tools, including the Datadog Agent, the trace-agent, and the custom time-series database built in-house. The episode includes concrete numbers: 30 trillion time-series points ingested daily, a 99.99 percent uptime target, and an SRE team managing 8,000 hosts across multiple cloud providers. Tune in for a rare look at how the watcher watches itself.
#Datadog #InfrastructureMonitoring #Dogfooding #SRE #Observability #TimeSeriesDatabase #DistributedTracing #AlertFatigue #CloudInfrastructure #EngineeringCulture #SiteReliabilityEngineering #DevOps #BusinessAndTechnology #FexingoBusiness #BusinessPodcast #CTO #TechnicalLeadership #Architecture
Keep every episode free: buymeacoffee.com/fexingo