Data Engineering Central Podcast

29

Academic → CTO: What Actually Matters in Data (Matthew Housley)

Most companies don’t have a tooling problem. They have a foundation problem.

In this episode, I sit down with Matthew Housley, co-author of Fundamentals of Data Engineering and former CTO of Ternary Data, to talk about what actually makes data teams successful and why so many organizations get it wrong despite having modern stacks, cloud platforms, and expensive dashboards.

Matthew’s path is a little different than most. He started in academia as a mathematics instructor before moving into industry as a data scientist at Overstock.com, and eventually leading data strategy and analytics as a CTO. That mix of academic rigor and real-world execution gives him a very clear perspective on where things break down.

We get into the gap between data science and real business impact, why analytics foundations matter more than flashy models, and what companies consistently underestimate when building out data platforms. We also talk about what it actually looks like to transition from academia to industry, and how that shapes how you think about data problems at scale.

If you’ve ever felt like your data stack should be delivering more value than it is, this conversation will probably hit close to home.

⏱️ Topics we cover:
* Why most analytics efforts fail before they even start
* The difference between “doing data” and delivering value
* Data science vs data engineering vs analytics reality
* Academic thinking vs industry execution
* What CTOs actually care about when it comes to data
* Building foundations that don’t fall apart six months later

Data Engineering Central is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.

Thanks for reading Data Engineering Central! This post is public so feel free to share it. This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit dataengineeringcentral.substack.com/subscribe

May 13, 2026

55m

28

AI Isn’t Replacing Curious Developers

AI isn’t just changing how we write code. It’s changing what it even means to build software.

In this episode of the Data Engineering Central Podcast, I sit down with Neil Roberts, a developer who’s been through every major wave of the web, from BASIC on an Atari to modern TypeScript, and now deep into LLMs and agentic workflows.

This is not another surface-level “AI will change everything” conversation. We get into what is actually happening right now, where it works, where it completely breaks, and what developers are getting wrong.

We talk about why front-end and UX matter more than ever in an AI world, why most people misunderstand agents, and what real day-to-day workflows with LLMs actually look like. There’s also a hard look at who benefits from AI, who falls behind, and whether we are quietly building fragile systems that we don’t fully understand.

If you’re a developer trying to figure out where this is all going, this is one of those conversations worth paying attention to.

Expect to learn:
* Why AI is as much a UX problem as it is a backend problem
* What “agents” actually mean in practice, not in demos
* Where LLM workflows are useful today and where they fail hard
* Whether junior developers should be worried or excited
* How building apps changes when AI is part of the system
* What developers should actually be doing right now to stay relevant

Neil also has a podcast, The Skill Tree, on AI and agentic-specific topics.

We also get into a bigger question most people are avoiding:
* Are we heading toward AI-assisted coding… or AI-orchestrated systems where developers become supervisors?
* And maybe more importantly… which side of that shift do you want to be on?

May 6, 2026

1h 03m

27

AI Is Changing Data Engineering Fast

In this episode of the Data Engineering Central Podcast, I sit down with Andreas Kretz to break down what is really happening in the industry right now. We go far beyond surface-level AI hype and talk about how data engineering actually works in the real world, what skills still matter, and where most engineers are wasting time.

Andreas shares his full journey from industrial IoT and working at Bosch to building one of the largest data engineering education platforms in the world, training over 2,000 students and reaching more than 100,000 engineers globally. We get into what production data systems actually look like, why most learning paths are broken, and how AI is reshaping the role of the modern data engineer.

We also dig into the uncomfortable truths. AI can write code, but it cannot replace thinking. Most engineers focus too much on tools and not enough on problem-solving, system design, and communication. That gap is only getting bigger.

If you are trying to figure out how to stay relevant in data engineering, or you are just getting started and want to avoid years of wasted effort, this conversation will change how you think about your career.

Today’s podcast is sponsored by Estuary. Without them, content like this isn’t possible. The best way to support this Newsletter is to check out what Estuary has to offer and click the links below.

Build millisecond-latency, scalable, future-proof data pipelines in minutes. Estuary is the Right-Time Data Platform that integrates all of the systems you use to produce, process, and consume data. It also provides best-in-class CDC (Change Data Capture). Estuary unifies today’s batch and streaming paradigms so that your systems, current and future, are synchronized around the same datasets, updating in milliseconds.

What we cover:
* Why most data engineers are learning the wrong things
* The shift from coding to problem-solving and system design
* How AI is actually changing data engineering workflows
* Why courses and tutorials are becoming less effective
* The difference between real production systems and “toy projects”
* The future of data engineering jobs and whether AI will replace them
* Why fundamentals still matter more than ever

One of the biggest takeaways is simple. The tools will keep changing, but the problems stay the same. The engineers who win are those who understand systems, ask better questions, and connect business problems to real solutions.

Links:
* Learn Data Engineering Academy: https://learndataengineering.com
* Andreas Kretz on LinkedIn
* Andreas Kretz on YouTube
* Sponsor: https://estuary.dev

Apr 29, 2026

56m

26

Most Data Teams Are Doing It Wrong

Most data teams think they’re building value. In reality, they’ve become ticket queues.

In this episode, Chris Gambill walks through his storied career in tech and data, from dealing with data at Fortune 500 scale to breaking out on his own.

We cover career growth, what separates senior engineers from true strategic operators, and the biggest mistakes people make early on. We discuss the classic problems that have plagued data teams for decades and why it’s all still a struggle.

Today’s podcast is sponsored by Estuary.

We also dig into Databricks vs Snowflake, what matters and what doesn’t, and how to think about modern data architecture without falling for marketing hype.

On the AI side, we talk about how LLMs have changed the developer lifecycle and the way we do data work, and about which human skills cannot be replaced.

If you care about leveling up beyond just building pipelines, this one is for you.

Apr 22, 2026

58m

25

From Industrial Data at BASF to Delta Lake Committer

In this episode, Robert Pack walks through his journey from engineering and simulation work to building large-scale data systems across 900+ plants at BASF.

We break down what those systems actually looked like, including ingestion, modeling, and the realities of batch vs real-time in industrial environments.

We also dive into:
* AI Workflows for Developers
* His work as a committer on Delta Lake
* Where lakehouse architecture works and where it falls short
* The transition into Developer Relations at Databricks

This is a grounded, practical conversation about what actually matters when building data platforms.

Today’s podcast is sponsored by Estuary.

You can find Robert on LinkedIn and GitHub, below.

Come follow me on YouTube!!

Apr 15, 2026

48m

24

He Quit Apple After 13 Years

In this episode of Data Engineering Central, I sit down with Kevin, who spent 13 years working at Apple before walking away at the end of 2025.

* Not to jump to another job.
* Not to start a company.
* But to take a step back from everything.

Kevin shares his full journey, from growing up in the suburbs of Atlanta to building a career at Apple, and ultimately reaching the point where he could walk away financially and mentally.

You can follow along with Kevin below.

We dive deep into what it’s really like working in tech: the high salaries, the lifestyle creep, the pressure, and the surprising reality that even people making great money often have no clear financial plan.

This conversation also explores the rise of FIRE (Financial Independence, Retire Early), how Kevin discovered it through Mr. Money Mustache, and why his perspective on it has changed over time.

What starts as a path to freedom can easily turn into a scarcity mindset, and that’s something most people don’t talk about.

We also get into:
* Why high income does not equal financial freedom
* The hidden trap of lifestyle inflation in tech
* The simple investing strategy that actually works (and why most people ignore it)
* Why many engineers are “close” to freedom but never pull the trigger
* The psychology of money, status, and why people stay stuck
* How a failed project and burnout became a turning point
* And how Kevin went from overworked and unhealthy… to climbing mountains and preparing to backpack 1,000 miles

This is not your typical “get rich quick” or “retire at 30” conversation. It’s a grounded, honest look at money, work, and what it actually takes to build a life you don’t need to escape from.

If you work in tech, think about FIRE, or just feel like you’re stuck on the treadmill, this one will hit home.

Apr 1, 2026

52m

23

Spark, AI, and the Future of Data Engineering with Daniel Aronovich

In this episode of Data Engineering Central, I sit down with the founder of DataFlint, Daniel Aronovich, to talk about the realities of working with Apache Spark, distributed data systems, and the future of data engineering.

We start with his early journey into tech: how he first discovered large-scale data systems and the lessons he learned from working with real-world Spark workloads.

The conversation then turns toward the future of data engineering, particularly the growing role of AI in software development and data infrastructure. We discuss why generic AI coding assistants often struggle with complex distributed systems, whether AI will eventually be able to automatically optimize data pipelines, and how the role of the data engineer may evolve in the coming years.

We also cover plenty of career advice for new and up-and-coming data professionals.

We also discuss the origin of DataFlint, a tool designed to help engineers better understand and optimize Spark workloads by analyzing execution plans, logs, and runtime context.

If you work with Spark, large-scale data pipelines, or modern data platforms, this conversation will give you a deeper look into how the data engineering landscape is evolving.

Mar 24, 2026

46m

22

DuckDB, AI, and the Future of Data Engineering

In this episode, I sit down with Matt Martin, Staff Engineer, data architect, ETL practitioner, and author of a forthcoming book on DuckDB, to talk about the past, present, and future of data engineering.

Matt has spent decades building and architecting data platforms across technologies such as SQL Server, Oracle, DB2, Hadoop, Redshift, and BigQuery, and now focuses on modern tools such as DuckDB and single-node analytics.

We discuss how the data industry has evolved, what actually makes data platforms succeed, and where tools like DuckDB, Polars, Databricks, and Snowflake fit into the future of analytics.

We also dive into the impact of AI on coding and data engineering, and whether distributed compute clusters will remain dominant or more workloads will move toward high-performance single-node systems.

Topics covered:
* Matt’s early career and journey into data engineering
* The evolution of data warehousing and ETL frameworks
* Traditional enterprise data systems vs modern cloud platforms
* DuckDB and the rise of single-node analytics
* Polars vs DuckDB: where each tool shines
* Databricks vs Snowflake
* AI-assisted coding and its impact on engineers
* The current data engineering job market
* Lessons learned from decades of building data systems
* Writing a book on DuckDB

Mar 18, 2026

1h 00m

21

What Decades in Software Engineering Teaches You

In this episode of Data Engineering Central, I sit down with John Crickett, a veteran software engineer with decades of experience in the industry, to unpack what really matters in building a long and successful engineering career.

We talk about how he first got into software, the early jobs and tools that shaped his thinking, and the massive technology shifts he’s witnessed across decades of engineering, from early stacks and tools to today’s AI-assisted workflows.

We also dive into the difference between coding and real-world software engineering, what separates junior, senior, and principal engineers, and why many developers misunderstand what it takes to grow in this field.

We discuss leadership vs individual contributor paths, the origin of his Coding Challenges platform, why algorithm puzzles dominate developer culture, and what actually makes engineers improve quickly.

Finally, we tackle the big question everyone is asking right now: how AI is reshaping software engineering, and what skills will matter most over the next decade.

Mar 11, 2026

1h 06m

20

Data Engineering, AI, and Career Growth

In this episode of the Data Engineering Central Podcast, I sit down with Yuki (Yuki Kakegawa) to talk about his journey into tech, the tools and platforms he’s worked with, and where he thinks data engineering and AI are headed next.

We cover:
• How Yuki got into tech
• Early career lessons and pivots
• Tools and technologies he’s worked with over the years
• How data engineering has evolved
• The impact of AI on software development
• What engineers should focus on right now
• Advice for those building their careers in data

Yuki shares practical insights on navigating the industry, staying adaptable, and thinking long-term about your technical growth.

If you’re a data engineer, aspiring engineer, or just interested in where AI and modern software are going, this one’s for you.

Yuki writes on …
• LinkedIn: https://www.linkedin.com/in/yukikakegawa/
• Blog: https://yukikakegawa.me/#blog

🔔 Subscribe for more interviews with leaders in data engineering, AI, and modern data platforms.

Mar 3, 2026

47m

19

Spark, Lakehouse & AI: A Deep Conversation with Bart Konieczny

In this episode of Data Engineering Central, I sit down with Bart Konieczny (data engineer, distributed systems expert, and well-known author in the Data and Spark ecosystem) for a deep technical conversation about modern data engineering.

We cover:
* How Bart got into tech and distributed systems
* His journey through different engineering roles
* Spark internals and why they still matter
* The realities of lakehouse architecture
* Streaming vs batch systems
* AI’s impact on data engineering
* What engineers should focus on in 2026

In a world obsessed with abstractions and AI tooling, we explore whether understanding the internals is still worth it, or if the game has fundamentally changed.

If you’re a data engineer, architect, or platform leader trying to navigate the next phase of the lakehouse era, this one’s for you.

🎙️ Data Engineering Central Podcast
Hosted by Daniel Beach

If you’re a CTO or data leader looking for help building or optimizing your data platform, reach out. Consulting inquiries welcome.

Feb 25, 2026

44m

18

DevOps vs ClickOps with Maxine Meurer

In this episode of the Data Engineering Central Podcast, I sit down with Maxine Meurer, DevOps engineer, author, and educator behind I Love DevOps, for a wide-ranging conversation about careers, infrastructure, automation, and what it actually means to build systems that last.

This isn’t a buzzword-heavy DevOps chat. It’s a grounded, honest discussion between two engineers about how people really get into tech, how careers evolve over time, and why modern infrastructure is as much about systems thinking and human judgment as it is about tools.

We talk through Maxine’s journey from early technical curiosity to hands-on DevOps work, the move from “ClickOps” to automation-first infrastructure, and how writing and teaching reshaped the way she thinks about engineering.

What we cover in this episode:
* 🛠️ From ClickOps to DevOps: what that transition actually looks like in the real world
* 🧠 Why DevOps is fundamentally about systems and people, not just pipelines and YAML
* 📚 How Maxine went from self-teaching to authoring practical guides like LLMs for Humans and The DevOps Career Switch Blueprint
* 🤯 Common mistakes engineers make when learning DevOps, cloud, and distributed systems
* 🔍 Testing failures, production realities, and where modern infrastructure still breaks down
* 🤖 What AI and LLMs actually change for engineers, and what’s mostly hype
* 🧭 Career advice for engineers without a traditional background
* 🔮 Where DevOps and platform engineering are heading over the next 3–5 years

Throughout the conversation, Maxine brings a refreshing, human-centered perspective to topics that are often over-abstracted or oversold. We dig into the tradeoffs behind tooling choices, the reality of production systems, and the importance of learning how to think, not just what to deploy.

If you’re navigating a DevOps or infrastructure career, wrestling with modern stacks, or trying to make sense of AI’s role in engineering, this episode offers clarity, context, and hard-won insight.

Learn more about Maxine’s work:
* Writing & guides:
* LinkedIn: https://www.linkedin.com/in/maxinemeurer/
* Gumroad resources: https://mameurer.gumroad.com

Feb 18, 2026

40m

17

The Evolution of Software, Streaming, and Data Engineering with Robin Moffatt

In this episode, I sit down with industry veteran Robin Moffatt, Sr. Principal Advisor in Streaming Data Technologies (Kafka, etc.) and a longtime voice in the data engineering community, to unpack the journey from old-school data architectures to today’s real-time streaming ecosystems. From early mainframe data processing and COBOL through the rise of Apache Kafka, streaming ETL, and event-driven systems, Robin shares lived experience from across decades of building, scaling, and evolving data platforms.

We dive into:
* 🧠 How the role of software engineering has shifted with the rise of distributed, real-time systems
* 📊 Why event streaming and platforms like Kafka aren’t just messaging systems, but the backbone of modern data architectures
* 🚀 How the community’s tooling and mental models have had to evolve, from static databases and nightly jobs to continuous, always-on streaming applications
* 🤖 A candid look at how AI and real-time data are intersecting, shaping both tooling and expectations for the next decade
* 🔮 Robin’s perspective on where the industry is headed: beyond buzzwords, toward real engineering maturity

Along the way, we get historical context, real-world lessons from conference stages and community forums, and a perspective on building resilient, scalable systems that power today’s data-rich applications.

If you’ve ever wondered how we got from batch jobs to continuous event streams, or what it really takes to build modern pipelines that support AI workflows, this conversation with Robin is a must-listen.

For more from Robin:
* 📍 His personal blog & talks: https://rmoff.net/
* 🔗 LinkedIn profile: https://www.linkedin.com/in/robinmoffatt

Feb 9, 2026

50m

16

The Lakehouse Architecture: Multimodal Data, Delta Lake, and the Future of Data Engineering (with R. Tyler Croy)

In this episode of the Data Engineering Central Podcast, I sit down with R. Tyler Croy for a wide-ranging conversation on the present and future of modern data platforms.

Tyler is a long-time open-source contributor to projects such as delta-rs. You can watch him on YouTube, read his blog, or work directly with him through his consultancy, Buoyant Data.

Tyler has spent years deep in the open-source data ecosystem, contributing to projects such as Delta Lake and thinking critically about how real-world data systems are built and maintained. This isn’t a hype-driven conversation. It’s a grounded discussion about what’s working, what’s breaking, and what’s coming next.

We dig into:
* What the Lakehouse architecture gets right, and where it still falls short
* Why multimodal data (text, images, audio, video, embeddings) changes everything
* How open table formats like Delta Lake fit into the next generation of data platforms
* The growing gap between data tooling hype and day-to-day data engineering reality
* What skills and architectural thinking will matter most for data engineers over the next decade

If you’re building or operating modern data platforms, and trying to separate real signal from noise, this episode is for you.

Feb 3, 2026

59m

15

Building the Full Data Stack and the Audience That Comes With It

In this episode of the Data Engineering Central Podcast, I sit down with Hoyt Emerson, founder of The Full Data Stack and Early Signal, for a wide-ranging conversation on data, analytics, and creating content in the tech world.

We talk candidly about:
* What actually matters in modern data and analytics
* Why so much “data content” misses the mark
* The difference between noise and real signal
* What works (and doesn’t) when building a technical audience
* Writing, consistency, and credibility in the data space
* Why opinions + experience beat trends and buzzwords

If you’re a data engineer, analyst, or technologist who’s curious about both building better data systems and communicating ideas that resonate, this episode goes deep on the lessons learned from doing both.

This is less about hacks and more about craft, judgment, and long-term thinking.

Jan 28, 2026

46m

14

From Wiring Circuits to Data Pipelines

In this episode of the Data Engineering Central Podcast, I sit down with Andy Leonard, someone who’s been building systems long before “data engineering” was even a job title.

Andy’s career didn’t start in software at all. It started with physical circuits, literally wiring systems as an electrician, before moving into programming, databases, and eventually decades of hands-on data engineering work.

This conversation isn’t about trends or hype cycles. It’s about how the fundamentals of data work have evolved, what hasn’t changed, and what you only learn after years of building, breaking, fixing, and rebuilding real systems.

We talk about how the industry got here, how tools have changed, where they haven’t helped as much as advertised, and what newer data engineers can learn from a long, practical career spent close to the metal.

If you’re interested in perspective, experience, and lessons earned the hard way, this one’s for you.

Jan 20, 2026

2h 10m

13

From DBA to Data Everything

In this episode of the Data Engineering Central Podcast, I interview a Data OG, someone who’s been around the data space forever, and we talk about all things data: past, present, and future.

I’m joined by Thomas Horton, a longtime friend and one of the most well-rounded data professionals I know. Over the course of his career, Tom has worn just about every hat in data: developer, DBA, analyst, and everything in between. He’s lived through the era of on-prem databases, the rise of analytics, and the constant reinvention that defines modern data engineering today.

We talk about what’s changed, what hasn’t, and why many of the “new” problems in data feel oddly familiar. We also dig into lessons learned the hard way, lessons that are just as relevant for early-career data engineers as they are for seasoned practitioners navigating today’s ever-expanding stacks.

On a personal note, a huge portion of what I know about relational databases and analytics can be traced back to Tom. This conversation is part reflection, part history lesson, and part reality check on where the data industry is headed next.

If you’re interested in the past, present, and future of data, and what really matters beneath all the tooling, this is an episode you won’t want to miss.

Jan 14, 2026

1h 06m

12

Scott Haines on the Future of Data Engineering

In this episode, I sit down with Scott Haines (O’Reilly author, Databricks MVP, and veteran of Yahoo, Nike, and Twilio) for a wide-ranging conversation on the real state of modern data engineering. We dig into open-source ecosystems, Lakehouse architectures, the evolution of Spark, streaming, what’s broken and what’s working in today’s data tooling, and the lessons Scott has learned scaling platforms at some of the biggest companies in the world.

If you care about data engineering, architecture, OSS, or the future of the modern data stack, you’ll love this one.

Make sure to follow Scott here on Substack, and over on GitHub.

Dec 17, 2025

1h 51m

11

Data Engineering Central Podcast - 09

Hello! A new episode of the Data Engineering Central Podcast is dropping today. We will be covering a few hot topics!

* Cluster Fatigue
* The Death of Open Source

Going to be a great show, come along for the ride!

Nov 13, 2025

6m

10

Data Engineering Central Podcast - Episode 8

This is a free preview of a paid episode. To hear more, visit dataengineeringcentral.substack.com

Hello! A new episode of the Data Engineering Central Podcast is dropping today. We will be covering a few hot topics!

* Apache Iceberg Catalogs
* new Boring Catalog
* new full Iceberg support from Databricks/Unity Catalog
* Databricks SQL Scripting
* DuckDB coming to a Lake House near you
* Lakebase from Databricks

Going to be a great show, come along for the ride!

Thanks …

Jul 10, 2025

5m

9

Apache Iceberg Rant.

Hello, my fair-weathered friends and readers! I am gone on vacation this week with my family, probably at this moment lying in the sand on a beach (Lord willing the creek don’t rise), not thinking of you all.

Anywho, be that as it may, I didn’t want you to miss my pretty face, so here is a video of me ranting about Apache Iceberg, something I’ve had a lot of practice doing and enjoy quite thoroughly.

For all you free-loaders out there, you can get 20% off to celebrate Memorial Day.

https://dataengineeringcentral.substack.com/Merica

May 26, 2025

11m

8

Data Engineering Central Podcast - 07

This is a free preview of a paid episode. To hear more, visit dataengineeringcentral.substack.com

It’s time for another episode of the Data Engineering Central Podcast. In this episode, we cover …

* Rust-based tool called uv to replace pip, Poetry, etc.
* Apache XTable and the Future of the Lake House
* How is AI going to affect you?

Thanks for being a consumer of Data Engineering Central; your support means a lot. Please share this podcast with your friend…

Apr 2, 2025

3m

7

Data Engineering Central Podcast - 06

It’s time for another episode of the Data Engineering Central Podcast. In this episode, we cover …

* AWS Lambda + DuckDB and Delta Lake (Polars, Daft, etc.)
* IaC - Long Live Terraform
* Databricks Data Quality with DQX
* Unity Catalog releases for DuckDB and Polars
* Bespoke vs Managed Data Platforms
* Delta Lake vs. Iceberg and UniForm for a single table

Thanks for b…

Feb 13, 2025

21m

6

Data Engineering Central Podcast - 05

In today's episode of the Data Engineering Central Podcast, we talk about a few hot topics: AWS S3 Tables, Databricks raising money, whether Data Contracts are dead, and the Lake House Storage Format battle!

It's a good one, buckle up!

Dec 20, 2024

21m

5

Data Engineering Central Podcast - 04

It’s time for another episode of the Data Engineering Central Podcast. In this episode we cover …

* Apache Airflow vs Databricks Workflows
* End-of-Year Engineering Planning for 2025
* 10 Billion Row Challenge with DuckDB vs Daft vs Polars
* Raw Data Ingestion

As usual, the full episode is available to paid subscribers, and a shortened version to you free loaders out there, don’t worry, I still love you though.

Nov 20, 2024

22m

4

Data Engineering Central Podcast - 03

It’s time for another episode of the Data Engineering Central Podcast, our third one! Topics in this episode …

* Should you use DuckDB or Polars?
* Small Engineering Changes (PR Reviews)
* Daft vs Spark on Databricks with Unity Catalog (Delta Lake)
* Primary and Foreign keys in the Lake House

Enjoy!

Oct 16, 2024

15m

3

Data Engineering Central Podcast - 02

Welcome to the Data Engineering Central Podcast, a no-holds-barred discussion on the Data Landscape.

Welcome to Episode 02

In today’s episode, we will talk about the following topics from the Data Engineering perspective …

* Using OpenAI’s o1 Model to do Data Engineering work
* Lord Save us from more ETL tools
* Rust for the small things
* Hosted (SaaS) vs Build

Oct 4, 2024

23m

2