EPISODE · Jun 16, 2026 · 9 MIN

How Datadog Monitors Its Own 100-Terabyte Infrastructure

from The CTO Podcast with Fexingo: Technical Leadership, Architecture, and Engineering Org · host Fexingo

Episode 54 of The CTO Podcast: Lucas and Luna explore how Datadog, the monitoring giant, uses its own tools to manage a sprawling infrastructure that ingests over 100 terabytes of data daily. They dive into the dogfooding strategy, the architectural choices that keep observability scalable, and the surprising insight that Datadog runs its entire backend on a single PostgreSQL fork with custom sharding. Lucas explains the engineering org structure behind the monitoring team, and Luna questions whether dogfooding can blind teams to customer pain. Specific examples include how Datadog handles metric cardinality explosion and why they built a separate time-series database internally before launching it as a product.

#Datadog #Observability #Dogfooding #TechLeadership #Infrastructure #PostgreSQL #Scalability #TimeSeriesDatabase #EngineeringCulture #Monitoring #CTOPodcast #FexingoBusiness #BusinessPodcast #Architecture #Sharding #MetricCardinality #SRE #CloudNative

Keep every episode free: buymeacoffee.com/fexingo

Frequently Asked Questions

How long is this episode of The CTO Podcast with Fexingo: Technical Leadership, Architecture, and Engineering Org?

This episode is 9 minutes long.

When was this episode of The CTO Podcast with Fexingo: Technical Leadership, Architecture, and Engineering Org published?

This episode was published on June 16, 2026.

What is this episode about?

Episode 54 of The CTO Podcast: Lucas and Luna explore how Datadog, the monitoring giant, uses its own tools to manage a sprawling infrastructure that ingests over 100 terabytes of data daily. They dive into the dogfooding strategy, the architectural...

Can I download this episode of The CTO Podcast with Fexingo: Technical Leadership, Architecture, and Engineering Org?

Yes, you can download this episode by clicking the download button on the episode player, or subscribe to the podcast in your preferred podcast app for automatic downloads.