EPISODE · May 15, 2026 · 40 MIN

Automatic Data Pipelining: One More Turtle Ahead

from Adventures in DevOps · host Will Button, Warren Parad

We grabbed Donald Nguyen, co-founder and CTO at Corvic, to discuss the absurd complexities of enterprise data and multimodal inference. We explore how organizations habitually hoard mountains of useless, "dead" data out of the sheer fantasy that someone might ask for it later. We highlight the fundamental disconnect: data collectors using tools like Airbyte and Kafka speak a completely different language than the business consumers analyzing the data in Excel.

True scale isn't just about managing petabytes; it's the absolute nightmare of extracting subjective business meaning from flat PDFs and invoices. In the deep end of vector embeddings, the challenge is that translating data into a different semantic universe requires imposing a heavy business bias. Auditors and artists will view the exact same invoice completely differently, so your choice of embedding model depends heavily on the business context.

The industry's desperate search for actual AI success stories beyond basic workflow automation is still ongoing. We laugh (and cry) at the reality that companies are likely budgeting 50% of an engineer's salary for LLM token usage, effectively enabling product managers to burn cash on infinite loops to generate prototype code. Reasonable or unreasonable?

And lastly, we tackle the existential dread of securing autonomous AI agents. Because fine-grained access control for agent actions is basically an unsolved fantasy, we must treat their execution environments as entirely untrusted, relying on rigid sandboxes like AWS Firecracker VMs. Prompt injection attacks are an inevitable flaw of the transformer architecture, and the industry's best defense mechanism seems to be wrapping models inside other models to validate the outputs. It is quite literally turtles all the way down, and the winner of enterprise security is simply the organization that manages to put one more turtle ahead of the attackers.

💡 Notable Links:
Kuuk Thaayorre Aboriginal Tribe - Cardinal Directions
✨ Episode: Generating automatic integrations at scale

🎯 Picks:
Warren - Dr. NEMO: Clockwise circle pit
Donald - Book: InvestiGators

Frequently Asked Questions

How long is this episode of Adventures in DevOps?

This episode is 40 minutes long.

When was this Adventures in DevOps episode published?

This episode was published on May 15, 2026.

What is this episode about?

We grabbed Donald Nguyen, co-founder and CTO at Corvic, to discuss the absurd complexities of enterprise data and multimodal inference. We explore how organizations habitually hoard mountains of useless, "dead" data...
