EPISODE · May 14, 2026 · 48 MIN

Chang She on Data Infrastructure for AI

from Generative AI in the Real World · host O'Reilly

As a pandas core contributor and early Parquet adopter who built AI data pipelines at streaming company Tubi TV, Chang She saw firsthand why the traditional data stack breaks down for AI workloads—and founded LanceDB to fix it. Chang joined Ben Lorica to explain why vector databases are too narrow a solution for modern AI data needs, and what a true multimodal data infrastructure actually looks like. Chang and Ben get into why the Lance file format is quickly becoming the open source standard for multimodal data, how the rise of agents is exploding data infrastructure demands, why open-weight models are the enterprise cost shift to watch in the next 12 months, and more. "Trillion is the new billion," Chang says, and the enterprises that set up their data infrastructure now for that scale will be the ones that succeed.

Frequently Asked Questions

How long is this episode of Generative AI in the Real World?

This episode is 48 minutes long.

When was this Generative AI in the Real World episode published?

This episode was published on May 14, 2026.
