PODCAST · technology

The Weight Update

by Kris Moore

AI intelligence for technology leaders. Model releases, infrastructure decisions, governance deadlines, vendor shifts, and talent signals, analyzed with evidence and delivered with opinion.

Each episode covers what changed this week in AI and what it means for your organization's strategy. Built for CTOs, CIOs, VPs of Engineering, and Heads of AI/ML who need to make decisions, not just stay informed.

AI-Assisted Production: Research and editorial direction by Kristopher Moore. Scripts developed with Claude (Anthropic). Narration by AI voice synthesis (Microsoft Edge TTS).


12

Where the Margin Showed Up

Three procurement-relevant events landed inside eight days. The Wall Street Journal scoops the OpenAI revenue and weekly-active-user miss, with same-day market reaction across Oracle, AMD, Broadcom, NVIDIA, and SoftBank. The UK AI Safety Institute publishes the first independent third-party measurement that puts a generally available model, GPT-5.5, in the same cyber-capabilities band as Claude Mythos Preview, with overlapping confidence intervals. And the harness layer underneath both: five Claude Code releases in five days, a new persistent-goal primitive in Codex, and the open-research harness class crossing into real procurement viability for the first time. The W18 frame: revenue stress at the top of the proprietary stack, harness consolidation while open-research catches up, and a third-party evaluation that challenges the access-control argument the leading lab has been using to justify its restricted release tier. Plus a one-act handoff on the Musk-Altman trial (full treatment on The Guardrail this week), the AI Feature Tracker, and five Monday-morning principles for CTOs.

AI Disclosure: This episode was produced with AI assistance. Research synthesis and script writing used Claude (Anthropic) under human editorial direction. Audio narration by Microsoft Edge TTS (en-US-AndrewNeural voice).

May 4, 2026

58m
11

Where the Margin Moved

Four major labs moved list prices up in April. Two open-weight shops moved prices down in the same eight days. Capability commoditized at the top of the leaderboard while unit economics diverged in three directions underneath. DeepSeek V4 shipped as the first serious frontier-class open-weight model trained without CUDA as a required dependency. SpaceX and Cursor announced a compute partnership with a 60-billion-dollar acquisition option attached. This episode walks through where margin is actually being defended (subscription and scope, not per-token rate), why monthly release cadence is now possible (sparse-RL consensus across four independent research groups), and what the week changes for Monday-morning procurement, plus the debut of the AI Feature Tracker recurring segment.

AI Disclosure: This episode was produced with AI assistance. Research synthesis and script writing used Claude (Anthropic) under human editorial direction. Audio narration by Microsoft Edge TTS (en-US-AndrewNeural voice).

Apr 25, 2026

49m
10

The Trust Crisis

Three arcs: (1) Opus 4.7 + the nerfing narrative + Mythos/Glasswing consortium capability decoupling, led by the AMD Senior Director telemetry case (GitHub #42796, 6,852 sessions, Pearson 0.971 correlation to redaction rollout, 125x cost spike); (2) Antigravity vs Codex 2026 vs Cursor vs Windsurf: market-share/mindshare divergence (Cursor $2B ARR) and fit-for-task patterns; (3) Models past code: FrontierScience Olympiad 77% vs Research 25% gap, benchmark saturation, custom silicon inflection (Maia 200, Trainium 3, TSMC 3nm bottleneck). Thesis: "Trust Crisis" = capability-vs-served-behavior decoupling. Cross-show pair with Guardrail Ep 8.

AI Disclosure: This episode was produced with AI assistance. Research synthesis and script writing used Claude (Anthropic) under human editorial direction. Audio narration by Microsoft Edge TTS (en-US-AndrewNeural voice).

Apr 21, 2026

59m
9

Everybody Shipped

A wide-aperture survey of the most concentrated AI news cycle of Q1 2026. In fourteen days: Meta launched Muse Spark under Alexandr Wang and walked away from the open-weight default that defined Llama. Zhipu shipped GLM-5.1, a frontier-class open-weight coding model trained end-to-end on Huawei Ascend silicon with zero NVIDIA in the stack. Anthropic unveiled Claude Mythos Preview via Project Glasswing, seeded to eleven named enterprise defensive partners (AWS, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorgan Chase, the Linux Foundation, Microsoft, NVIDIA, Palo Alto Networks), and explicitly declined GA release over cybersecurity dual-use risk. NVIDIA put Vera Rubin into production at 2,300 watts per GPU with mandatory liquid cooling. OpenAI killed Sora because the unit economics didn't work and redirected the compute to Codex and enterprise agents. Google went GA with Ironwood, the seventh-generation TPU. MemPalace v3.0 hit 21,700 GitHub stars in four days claiming the top of the LongMemEval benchmark (amid significant community skepticism about the benchmark methodology and one of the two named creators' actual technical involvement). Kimi K2.5 cut its input price again.

This episode walks the field lab by lab and chip by chip (US frontier, Chinese open-weights wave, silicon, coding agents, the memory layer) and closes with what got heavier and what got lighter for a CTO making vendor decisions right now. Three Forward Look predictions are logged for accountability.

The episode is honest about which claims are vendor self-reports and which are independently verified. Two single-source claims (GLM-5.1 SWE-Bench Pro 58.4, MemPalace LongMemEval 96.6%) are flagged in-episode as pending independent reproduction.

Runtime: 58 minutes. Coverage window: 2026-03-26 to 2026-04-08.

AI Disclosure: This episode was produced with AI assistance. Research synthesis and script writing used Claude (Anthropic) under human editorial direction. Audio narration by Microsoft Edge TTS (en-US-AndrewNeural voice). Edited to fix a TTS defect.

Apr 10, 2026

58m
8

The Practitioner's Guide to TurboQuant

KV cache compression on your own hardware: what works, what doesn't, and when to care.

Google's TurboQuant paper compresses KV cache to 3 bits per coordinate, reporting 6x memory reduction, 8x faster inference, zero accuracy loss, and no retraining required. This deep-dive walks through what it actually is, the three-layer compression stack, real benchmark results on a consumer RTX 4090, community implementations available today, the Hugging Face ecosystem integration, and a CTO decision framework for when this matters to your org. Companion to the LinkedIn article of the same name.

25 sources cited. Full source list in show notes.

AI Disclosure: This episode was produced with AI assistance. Research synthesis and script writing used Claude (Anthropic) under human editorial direction. Audio narration by Microsoft Edge TTS (en-US-AndrewNeural voice).

Apr 3, 2026

25m
7

The Security Inflection

Mythos changes the threat model, three agent runtimes compete, OpenAI kills Sora, and four compliance deadlines land in four months.

A leaked frontier model codenamed Mythos revealed AI-driven cyberattack capabilities that compress vulnerability exploitation from days to hours. Three agent runtimes are now competing for the enterprise stack. OpenAI shut down Sora. Private credit markets are reshaping AI infrastructure financing. And four compliance deadlines (Colorado AI Act, EU AI Act transparency, NIST agent standards, and the Pentagon's 30-day deployment directive) all land within four months.

123 sources cited. Full source list in show notes.

AI Disclosure: This episode was produced with AI assistance. Research synthesis and script writing used Claude (Anthropic) under human editorial direction. Audio narration by Microsoft Edge TTS (en-US-AndrewNeural voice).

Apr 3, 2026

41m
6

Follow the Money

Short Description: What $700 Billion in AI Spending, $16 Billion in Insider Selling, and a $2 Trillion IPO Pipeline Tell Us About What May Come Next

Episode Description: This special edition synthesizes the four-part "Follow the Money" article series into a single audio narrative. It steel-mans the paradise narrative, traces the financial stress points through the doom loop, and lands on the only honest conclusion: neither should change what a well-run technology organization does tomorrow.

Topics covered: $16B insider selling in 2025, Oracle's $108B debt and $300B OpenAI partnership, $80-95B annual stock-based compensation across Big 5 hyperscalers, the $2.9T IPO pipeline, underwriter conflicts (Goldman as both Anthropic investor and OpenAI IPO underwriter), 2008 structural parallels, rate scenarios, SBC death spiral mechanics, 5 hedging strategies for CTOs and boards, and ecosystem lock-in dynamics.

Companion articles on LinkedIn: Follow the Money Parts 1-4.

AI-Assisted Production: Research and editorial direction by Kristopher Moore. Scripts developed with Claude (Anthropic). Narration by AI voice synthesis (Microsoft Edge TTS, en-US-AndrewNeural). All content is human-directed and editorially reviewed.

Mar 27, 2026

46m
5

The Agent Security Reckoning

AI agent capability has dramatically outpaced AI agent security. Over 1,184 malicious skills were found in the OpenClaw ecosystem, 135,000 instances were publicly exposed with zero authentication, and the CVE list grew to four critical vulnerabilities in weeks. Simultaneously, the most complex AI compliance environment in history emerged from a three-way collision between federal preemption, state laws, and EU enforcement. Defense AI crossed from pilots to permanent institutional deployment with Anduril's $20B Army contract and Palantir's Maven program of record, while three frontier labs shipped autonomous desktop agents in the same month.

AI Disclosure: This episode was produced with AI assistance. Research synthesis and script writing used Claude (Anthropic) under human editorial direction. Audio narration by Microsoft Edge TTS (en-US-AndrewNeural voice).

Mar 25, 2026

45m
4

What Model, Where, At What Cost — The Three Decisions That Define Your AI Stack

Instead of the usual news roundup, this episode walks through the three decisions every technology leader deploying AI in 2026 needs to articulate: which model, where to run it, and which harness wraps it.

The model landscape now includes 7+ serious contenders across 4 countries, with a 36x price spread between frontier and budget tiers. The inference provider market has fragmented into four tiers: direct API, custom silicon (Groq, Cerebras, SambaNova), GPU-optimized (Fireworks, Together), and self-hosted. And the most important finding in AI tooling this year: harness design drives 22% of performance variance, while model selection drives just 1%.

Three worked scenarios show how these decisions compound: AI coding assistants, customer-facing agents, and batch processing pipelines, with real pricing and architecture trade-offs for each.

The episode splits at the 40-minute mark. The first half is the framework for your next board meeting or leadership discussion. The second half is detailed data (model-by-model pricing, provider-by-provider throughput, tool-by-tool comparison) for the technical leads on your team who need to build the evaluation.

Plus: GTC preview, Oracle's $50B infrastructure raise, defense AI hiring data, and the 90-day trajectory for multi-model routing, custom silicon adoption, and harness convergence.

38 sources cited. Full source list in show notes.

AI Disclosure: This episode was produced with AI assistance. Research synthesis and script writing used Claude (Anthropic) under human editorial direction. Audio narration by Microsoft Edge TTS (en-US-AndrewNeural voice).

Mar 12, 2026

28m
3

GPT-5.4 and the hardware wall

GPT-5.4 just dropped with superhuman computer use and million-token context, but it doesn't win everything. Claude leads coding, Gemini leads reasoning. The era of one best model is over.

Meanwhile, DeepSeek V4 is stuck because Huawei Ascend chips can't handle frontier training. Only 8 models have ever trained on Huawei from scratch, and 5 of them are from Huawei or its closest partners. Next-gen Huawei chips will be weaker than current ones (built on smuggled TSMC dies), and domestic memory production caps at ~300K accelerators per year.

Plus: the distillation threat (24K fake accounts mining Claude), Qwen 3.5's leadership exodus, GLM-5's real-world limitations, OpenAI's record $110B funding, BlackRock's $40B data center acquisition, and a 96% pricing collapse in 3 years.

Mar 6, 2026

43m
2

The Safety Paradox

The company built to make AI safe just got labeled a national security threat. Plus: MCP's 97 million downloads have a massive security hole, NVIDIA hits $68B but can't get enough memory chips, three compliance deadlines are about to collide, and Block just blamed AI for cutting 40% of its workforce.

Mar 4, 2026

48m


