EPISODE · Oct 22, 2025 · 20 MIN

Fabric data lake performance: fix slow workloads with Azure Container Storage v2 and local NVMe for real‑time analytics

from M365.FM - Modern work, security, and productivity with Microsoft 365 · host Mirko Peters - Founder of m365.fm, m365.show and m365con.net

Fabric data lake performance: in this episode of M365.fm, Mirko Peters explains why your Fabric lakehouse feels slow not because of Spark, Power BI, or engineers, but because your data lives on remote, managed storage that behaves like a networked file share from 2003. He opens with a brutal truth: every query, transform, and dashboard waits on storage latency first, and as long as your bytes commute across Azure’s network to reach compute, you are paying for CPUs to sit idle while I/O negotiations crawl along.

He then unpacks how Fabric and Power Platform end up bottlenecked by their own convenience. Managed tiers promise elasticity and durability, but each layer (service fabrics, gateways, redundancy, regional routing) adds milliseconds that quietly stack into minutes on trillion‑row refreshes. Mirko likens managed storage to a postal service: reliable and distributed, but absurd when you are trying to do millisecond analytics. Meanwhile, administrators keep scaling nodes and Spark pools, unknowingly feeding a bottleneck that more compute cannot fix because the physics of distance remain unchanged.

From there, he introduces Azure Container Storage v2 as the NVMe fix for this drag. ACStor v2 abandons the old, complex design and goes all‑in on local NVMe disks wired directly to the host’s PCIe lanes, stripping out managed disks, LVM, and etcd to focus on raw I/O. Volumes are automatically striped across every NVMe drive on a node, trading redundancy for maximum throughput so even small workloads inherit the full bandwidth of the underlying hardware. Mirko explains how this transforms Spark shuffles, Fabric staging zones, and AI caches from network‑bound operations into near‑silicon‑speed workloads.

The episode demystifies NVMe by contrasting it with traditional cloud storage. Legacy protocols serialize operations through a single lane, while NVMe uses thousands of parallel queues mapped straight to the CPU, turning I/O into a massively concurrent conversation instead of a checkout line. ACStor v2 leverages that design so Fabric and Kubernetes workloads talk to storage like it is part of the server, not a distant service, yielding sub‑millisecond latency and multi‑gigabyte‑per‑second throughput without renting premium SAN capacity.

Mirko also tackles practicality and eligibility. He shows where local NVMe disks actually live in Azure (L‑series storage‑optimized VMs, NC‑series GPU machines, and selected D/E series with “temporary” disks) and why ACStor v2 turns those often‑ignored local drives into your primary performance engine instead of a scratchpad. Because NVMe is already baked into the VM price, you stop paying extra for managed speed and start exploiting hardware you already own. He closes with patterns for mapping Fabric lakehouses, Power Platform workloads, and analytic pipelines onto NVMe‑backed storage so your data lake finally moves at the speed your architectures were designed for.

WHAT YOU WILL LEARN
- Why Fabric and Power Platform workloads feel slow even on powerful compute.
- How managed storage distance, not bad queries, creates most data‑lake latency.
- What Azure Container Storage v2 changes by going all‑in on local NVMe disks.
- How automatic RAID striping across NVMe drives unlocks million‑IOPS performance.
- Where to find NVMe‑enabled VM families and how to align Fabric workloads to them.

THE CORE INSIGHT
Your Fabric data lake is not underpowered; it is geographically wrong. As long as your data lives on remote managed storage, you are paying premium prices for CPUs to wait on network trips. Move it onto local NVMe with ACStor v2, and the same workloads sprint without changing a single line of code.

WHO THIS EPISODE IS FOR
This episode is ideal for data engineers, analytics architects, and platform teams running Fabric, Power BI, or Power Platform on Azure who are tired of blaming queries and clusters for problems caused by storage topology. It is especially valuable if you are evaluating new VM families, modernizing lakehouses, or building high‑throughput AI and analytics pipelines and need a concrete, hardware‑aligned strategy to make them actually feel real‑time.

ABOUT THE HOST
Mirko Peters is a Microsoft 365 and cloud consultant focused on building fast, governed data platforms with Fabric, Azure, and the Power Platform. Through M365.fm, he shares practical performance stories, architecture deep dives, and hardware‑aware patterns that help teams escape slow data lakes and finally match analytical ambitions with the I/O their workloads deserve.

Become a supporter of this podcast: https://www.spreaker.com/podcast/m365-fm-modern-work-security-and-productivity-with-microsoft-365--6704921/support.
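
The summary says volumes are striped across every NVMe drive on a node, trading redundancy for throughput. As a rough illustration of what striping means in general (RAID 0 style), the sketch below maps a logical volume offset to a drive and an offset on that drive. It is not ACStor v2's actual implementation: the function names, the 64 KiB stripe size, and the drive count are all assumptions made for the example.

```python
from typing import NamedTuple


class StripeLocation(NamedTuple):
    drive: int         # index of the drive that holds the byte
    drive_offset: int  # byte offset within that drive


def locate(logical_offset: int, num_drives: int,
           stripe_size: int = 64 * 1024) -> StripeLocation:
    """Map a logical volume offset to (drive, offset) under plain striping.

    stripe_size is a hypothetical value chosen for illustration only.
    """
    if logical_offset < 0 or num_drives < 1 or stripe_size < 1:
        raise ValueError("offset must be >= 0; drives and stripe size >= 1")
    # Which stripe the byte falls in, and where inside that stripe.
    stripe_index, within = divmod(logical_offset, stripe_size)
    # Stripes are dealt round-robin across the drives.
    drive = stripe_index % num_drives
    # Each full pass over all drives adds one stripe "row" per drive.
    row = stripe_index // num_drives
    return StripeLocation(drive, row * stripe_size + within)


def drives_touched(start: int, length: int, num_drives: int,
                   stripe_size: int = 64 * 1024) -> set:
    """Return the set of drives a contiguous read of `length` bytes hits."""
    if start < 0 or length < 1 or num_drives < 1 or stripe_size < 1:
        raise ValueError("start must be >= 0; other arguments >= 1")
    first = start // stripe_size
    last = (start + length - 1) // stripe_size
    return {stripe % num_drives for stripe in range(first, last + 1)}
```

With four drives, a contiguous read of four stripes touches all four drives, so the drives can serve it in parallel. The same layout also explains the redundancy trade-off: every drive holds part of the volume, so losing any one drive loses the volume.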
