AI - Beyond the Hype Podcast - All Episodes

6

Securing the Substrate: Why AI Without Data Security Is a Breach Waiting to Happen

Sarah and James open the three-part Data Security for AI series with a simple argument: AI is only as trustworthy as the data underneath it.

What we cover

The adoption gap: Gartner expects 40% of enterprise apps to embed AI agents by the end of 2026 (up from less than 5%). IBM’s 2025 Cost of a Data Breach Report found that 13% of organisations have had an AI-related breach, and 97% of those lacked proper AI access controls.

Structured vs unstructured data: IDC estimates 80–90% of enterprise data is unstructured. Varonis found only 1 in 10 organisations have labelled files, and 88% still have “ghost” accounts. Point a copilot at that estate and every overshared file is exposed.

The incident catalogue: Samsung engineers pasting source code into ChatGPT (2023). Microsoft’s AI research team exposing 38 TB of private data via a misconfigured Azure SAS token (2023). DeepSeek’s ClickHouse leak exposing chat histories and API keys (2025).

Liability is real: Moffatt v. Air Canada (2024), where the airline argued its chatbot was a separate legal entity and lost. NYC’s MyCity chatbot, which The Markup reported was encouraging illegal behaviour.

Shadow AI: IBM found that breaches involving shadow AI cost US$670K more and make up 20% of incidents.

Memorisation: Carlini et al. (ICLR 2023) showed that memorisation of training data grows with model size, duplication of the data, and the length of the prompt context, so sensitive data should be treated as eventually leakable.

Sources

Gartner 40% forecast: https://finance.yahoo.com/news/40-enterprise-apps-embed-ai-181310288.html
IBM 2025 Cost of a Data Breach: https://www.ibm.com/reports/data-breach
IBM analysis (97%, US$670K): https://www.kiteworks.com/cybersecurity-risk-management/ibm-2025-data-breach-report-ai-risks/
IDC unstructured data: https://blog.box.com/90-percent-unstructured-data
Varonis 2025 State of Data Security: https://www.varonis.com/blog/state-of-data-security-report
Samsung ChatGPT leak: https://www.pcmag.com/news/samsung-software-engineers-busted-for-pasting-proprietary-code-into-chatgpt
Microsoft 38 TB exposure: https://www.wiz.io/blog/38-terabytes-of-private-data-accidentally-exposed-by-microsoft-ai-researchers
DeepSeek ClickHouse exposure: https://www.wiz.io/blog/wiz-research-uncovers-exposed-deepseek-database-leak
Moffatt v. Air Canada (Forbes): https://www.forbes.com/sites/marisagarcia/2024/02/19/what-air-canada-lost-in-remarkable-lying-ai-chatbot-case/
NYC MyCity (The Markup): https://themarkup.org/artificial-intelligence/2024/04/02/malfunctioning-nyc-ai-chatbot-still-active-despite-widespread-evidence-its-encouraging-illegal-behavior
Cisco 2024 Privacy Benchmark: https://www.cisco.com/c/dam/en_us/about/doing_business/trust-center/docs/cisco-privacy-benchmark-study-2024.pdf
Carlini et al., ICLR 2023

Apr 29, 2026

21m
5

The Invisible Architecture: Why Data Modelling Is the Make-or-Break for Enterprise AI

Sarah and James unpack a question most AI programmes never ask early enough: is the data actually modelled? Drawing on recent benchmarks, documented enterprise failures, and hard ROI evidence, they explore why AI accuracy drops to zero without proper data foundations, why 80% of AI projects stall on data rather than algorithms, and what leaders can do about it. From the London Whale to Walmart's checkout fiasco, this episode puts data modelling in the language of business risk, competitive advantage, and AI readiness.

References:

A Benchmark to Understand the Role of Knowledge Graphs on Large Language Model's Accuracy for Question Answering on Enterprise SQL Databases - https://arxiv.org/abs/2311.07509
The Consequences of Poor Data Quality: Uncovering the Hidden Risks - https://www.actian.com/blog/data-management/the-costly-consequences-of-poor-data-quality/
The Root Causes of Failure for Artificial Intelligence Projects and How They Can Succeed - https://www.rand.org/content/dam/rand/pubs/research_reports/RRA2600/RRA2680-1/RAND_RRA2680-1.pdf
Generative AI Benchmark: Increasing the Accuracy of LLMs ... - https://data.world/blog/generative-ai-benchmark-increasing-the-accuracy-of-llms-in-the-enterprise-with-a-knowledge-graph/
How a Single Source of Truth for Data Unlocks Growth ... - https://vizule.io/single-source-of-truth-data/
Is a Semantic Layer Necessary for Enterprise-Grade AI Agents? - https://www.tellius.com/resources/blog/is-a-semantic-layer-necessary-for-enterprise-grade-ai-agents
The Impact of Poor Data Quality (and How to Fix It) - https://www.dataversity.net/articles/the-impact-of-poor-data-quality-and-how-to-fix-it/
Impact of Poor Data Quality on Business Performance: Challenges, Costs, and Solutions - https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4843991
The ROI of Data Modeling ... - https://sqldbm.com/blog/the-roi-of-data-modeling-speaking-to-the-c-suite-using-business-metrics/
Master Data Management Case Study: Luxury Retail Transformation - https://flevy.com/topic/master-data-management/case-master-data-management-enhancement-luxury-retail
MDM case study: The value of the Golden Record and mastering your data - https://qmetrix.com.au/case-study/mdm-case-study-the-value-of-the-golden-record-and-mastering-your-data/
JPMorgan Chase London Whale C: Risk Limits, Metrics, and Models

Apr 20, 2026

20m
4

Why Data Observability Matters Before AI Scales

In the first episode of AI - Beyond the Hype, Sarah and James explore why data observability is one of the most overlooked foundations of enterprise AI readiness. They discuss how incomplete, delayed, duplicated, or poor-quality data can quietly undermine dashboards, reporting, and AI outcomes, and why better AI still starts with better data. (Sources: https://learn.microsoft.com/en-us/azure/cloud-adoption-framework/scenarios/cloud-scale-analytics/manage-observability, https://www.ibm.com/think/topics/ai-data-quality)

They explain that AI success depends on more than models or tools. Organisations need confidence that data is flowing correctly from operational systems into a central platform for analytics, reporting, and AI use cases. Without strong foundations, AI can create polished outputs built on unreliable information. (Sources: https://cloud.google.com/transform/how-to-build-strong-data-foundations-gen-ai, https://www.mckinsey.com/capabilities/tech-and-ai/our-insights/the-data-dividend-fueling-generative-ai)

The episode also unpacks the difference between pipeline monitoring and true data observability. A pipeline may run successfully and still produce untrustworthy data. Observability helps teams detect, diagnose, and prevent issues before they create business impact. (Sources: https://www.databricks.com/blog/what-is-data-observability, https://learn.microsoft.com/en-us/azure/cloud-adoption-framework/scenarios/cloud-scale-analytics/manage-observability)

Key takeaways:

AI readiness is not the same as AI enthusiasm. Strong data foundations determine what is actually possible. (Source: https://www.mckinsey.com/capabilities/tech-and-ai/our-insights/the-data-dividend-fueling-generative-ai)

Source-system data quality should be validated early, with ongoing checks for completeness, accuracy, and uniqueness. (Source: https://docs.aws.amazon.com/wellarchitected/latest/analytics-lens/best-practice-1.1---validate-the-data-quality-of-source-systems-before-transferring-data-for-analytics..html)

Poor data quality is one of the most common reasons AI initiatives fail. (Source: https://www.ibm.com/think/topics/ai-data-quality)

Why this matters:

For leaders, this is not just a technical issue. It is a question of trust, decision quality, governance, and risk. If the data underneath reporting and AI is weak, faster systems can simply produce faster bad answers. (Sources: https://learn.microsoft.com/en-us/azure/cloud-adoption-framework/scenarios/cloud-scale-analytics/manage-observability, https://www.ibm.com/think/topics/ai-data-quality)

Memorable ta...

Apr 14, 2026

12m
3

AI - Beyond the Hype - Trailer

Welcome to AI - Beyond the Hype, a podcast for executives, technology leaders, and data teams who want a clearer, more practical conversation about what it really takes to make AI work in the enterprise. In this short trailer, Sarah and James introduce the show and explain why data quality, observability, governance, and trust matter just as much as the AI itself.

Apr 13, 2026

2m



ABOUT THIS SHOW

AI - Beyond the Hype is a podcast for senior executives, technology leaders, and data professionals who want a clear-eyed view of what it really takes to make AI work in the enterprise.

Each short episode is designed for easy consumption by busy leaders and executives, offering concise, practical conversations on the foundations behind successful AI adoption, from data quality and observability to governance, operating models, architecture, and trust. Through thoughtful, conversational dialogue, the show connects executive priorities with the technical realities that determine whether AI delivers meaningful value or simply creates more noise.

If your organisation is asking big questions about AI readiness, digital transformation, and data-driven decision-making, this podcast is designed to help you quickly separate what sounds impressive from what actually works.

HOSTED BY

Sarah, James & Darryl
