COEY Cast Podcast - All Episodes

180

Open Source Vibe Check with VibeVoice and MOSS Audio

Microsoft's open source VibeVoice puts real pressure on audio workflows with multilingual transcription, speaker tracking, timestamps, and long context that can turn recordings into searchable assets. MOSS Audio adds a broader layer of audio understanding with emotion cues, music recognition, sound events, and time aware analysis that could help media teams mine podcasts, calls, ads, and live recordings for actual insight. Then Eva Brain enters with a bigger question for marketers: which parts of campaign management can agents really handle, and where do humans still need to lead? The bigger takeaway is simple. The model matters, but the workflow matters more when teams want automation that is useful, reliable, and still grounded in human judgment.

May 1, 2026

179

Open Up: Nemotron, LLM jp 4, and Laguna

Open models are having a real moment, and this trio shows why. NVIDIA Nemotron 3 Nano Omni points to simpler multimodal workflows by handling text, image, audio, and video in one stack. LLM jp 4 shows how regional open models can beat bigger global names when language, culture, and local context actually matter. Poolside Laguna brings the coding angle, but the bigger story is automation infrastructure for marketing teams that need custom tools, connectors, and internal workflows. The takeaway is practical: open can mean more control, flexibility, and lower lock in, but it also means more responsibility. Better systems win, especially when humans stay in the loop where judgment and brand risk matter most.

Apr 30, 2026

178

OpenAI GPT 5.5 Ships Quietly, Workflows Loudly

OpenAI dropped GPT 5.5 into the API with a huge context window, stronger reasoning, and deeper tool use, and the bigger story is how fast teams can put it to work. This covers why quiet launches matter more than flashy keynotes when marketers, creators, and operators need real workflow gains. It also digs into where automation actually helps first, from research briefs and call note synthesis to support flows with clean guardrails. ElevenLabs adds voice agent templates that make testing easier, while MiniMax Music 2.6 lowers the cost of experimenting with AI audio. The throughline is simple: AI is getting less performative and more operational, and the winners will be teams that ship practical systems with humans still making the calls.

Apr 28, 2026

177

Audio Flamingo Next and the Rise of Specialist AI

AI is getting less monolithic and more specialized, and that shift matters for anyone building real workflows. OpenAI’s GPT Rosalind signals that domain specific models are becoming a serious enterprise play. Higgsfield’s sci fi pilot shows AI video is pushing past flashy clips into longer form storytelling and faster pre production. NVIDIA’s open Audio Flamingo Next points to practical wins in podcast mining, searchable archives, call review, and media repurposing. The throughline is simple. General models still help orchestrate the stack, but specialist systems are where trust, depth, and format specific performance start to matter most. The real advantage comes from designing around recurring jobs, not chasing every shiny model release.

Apr 19, 2026

176

Microsoft Foundry Gets Voice, Images, and Transcripts

Microsoft just bundled MAI Transcribe 1, MAI Voice 1, and MAI Image 2 into Foundry, giving teams one place to handle transcription, synthetic voice, and image generation inside enterprise workflows. That sounds convenient, and it is, but it also raises the classic question of speed versus lock in. The conversation also digs into Audio Omni and why unified audio models could become real creative partners for editing, localization, sound design, and campaign iteration. Then it shifts to the less flashy but more important layer of AI adoption: rights, provenance, royalties, and governance. The real advantage is not stacking more models. It is building workflows that stay modular, accountable, and useful when real teams have to ship.

Apr 17, 2026

175

Closed Doors and Voice Chords with Claude and Gemini

Claude Opus 4.7, Gemini 3.1 Flash TTS, and GPT 5.4 Cyber all point to the same shift: AI tools are getting shaped around real jobs, not just flashy demos. This conversation breaks down what better instruction following, vision handling, task controls, and voice direction actually mean for creative ops, marketing teams, and automated content pipelines. The bigger takeaway is that stronger models do not fix weak workflows. Teams still need clear approvals, human judgment, and guardrails that keep fast production from turning into fast chaos. Open versus closed also stays in play as companies weigh convenience, privacy, portability, and control while building model stacks that fit how work actually gets done.

Apr 16, 2026

174

Open Source, Open Questions: MiniMax, Orion1, LTX 2.3

Open models are growing up fast, and that changes how teams should build with AI. MiniMax M2.7 puts fresh pressure on closed platforms with strong coding, long context, and agent potential, but the real story is workflow portability and avoiding vendor lock in. Orion1 pushes multilingual speech recognition forward with better support for lower resource languages, opening new possibilities for transcription, localization, and audience reach. OpenArt LTX 2.3 makes open video more practical for social content with better prompt adherence, smoother motion, audio sync, and portrait output. The bigger takeaway is simple. Date the model, marry the workflow, and keep humans in the loop where judgment still matters most.

Apr 14, 2026

173

Meta’s Muse Spark and the Closed AI Workflow Play

Meta’s Muse Spark is the clearest sign that AI is shifting from flashy demos into real workflow. The bigger move is not just the model itself, but the way Meta dropped it straight into the apps people already use all day. That makes Muse Spark immediately relevant for marketers and creators working across Facebook, Instagram, Messenger, WhatsApp, and more. The conversation also tracks why Google DeepMind’s Lyria 3 Pro matters for production ready audio, and why Wan 2.7 is getting attention for practical video controls like motion transfer and multi image guidance. The common thread is simple. AI is getting less theatrical and more useful for teams trying to ship content faster with humans still making the calls.

Apr 13, 2026

172

OpenClaw and Open Models: Build Your Own AI Stack?

Open source AI is moving from hobbyist flex to serious business infrastructure. DeepSeek's open models, OpenClaw, and new voice systems are pushing teams to ask a bigger question: should you own your AI stack instead of renting one through a closed API? This conversation breaks down what that shift actually means for marketers, creators, and operators. Private deployment, multilingual generation, branded voice workflows, and multi agent automation all sound great until governance, security, evals, and process design show up. The real opportunity is not just cheaper models. It's more control, better customization, and workflows built around your team. The real risk is automating chaos faster if humans are not steering the system.

Apr 12, 2026

171

Seedance in the Fast Lane, Plus Happy Horse and Music 2.6

ByteDance's Seedance 2.0 is making AI video feel ready for real production, not just flashy demos. The big shift is workflow. Teams can move from concept to scene direction, motion, and synced audio in one place with fewer handoffs and fewer weird artifacts. That changes how marketers, creators, and media teams think about speed, approvals, and risk. The conversation also looks at Alibaba's Happy Horse as a stealth video contender, why open versus closed systems still matters for automation strategy, and how MiniMax Music 2.6 is pushing audio AI toward editing, continuation, and usable post production. The takeaway is simple. Better models matter, but better workflows are what actually ship.

Apr 12, 2026

170

Veo Goes Wide, ElevenLabs Locks Down, Firefly Cleans Up

Google is opening Veo 3 to more teams and introducing Veo 3.1 Lite as a lower cost path to cinematic video generation. That matters less for pure wow factor and more for iteration, pre production, and testing creative directions before a full shoot. ElevenLabs is pushing voice AI deeper into enterprise with on prem and on device options that make privacy, compliance, and localization more realistic for serious buyers. Adobe Firefly rounds out the story with tools built for control, brand safety, and high volume asset production. The bigger theme is simple. AI wins when it fits the workflow. Better models help, but judgment, review, transparency, and usable handoffs still decide what actually ships.

Apr 10, 2026

169

Open Source Goes Long with GLM 5.1

Z.ai's open source GLM 5.1 is pushing the AI conversation past chat and straight into workflow automation. The model promises long horizon autonomy, tool use, structured outputs, and the kind of persistence that could make open models far more useful for creators, marketers, and operators. That does not mean teams should hand over the keys. The real opportunity is using strong open models for recoverable, high volume tasks while keeping humans at critical decision points. The conversation also digs into Anthropic's Claude Managed Agents and Google's Lyria 3 to show where agent infrastructure and AI audio are getting more practical, and where taste, oversight, and process design still matter most for getting real work done.

Apr 9, 2026

168

Mythos, Meta, and CREATUS Walk Into a Workflow

Anthropic’s Claude Mythos Preview arrives as a locked down frontier model, and the restrictions are almost as important as the benchmarks. That raises a bigger question about trust, safety, and whether teams should bet on closed systems or wait for open alternatives. Meta keeps pushing ad automation with Andromeda and Advantage+, shifting media buying away from manual controls and toward creative volume, structured variation, and stronger human judgment. CREATUS.AI brings a more practical video update with better motion, lip sync, longer clips, and audio driven generation that could actually speed up production. The throughline is simple: AI is getting more useful when it fits real workflows, not just flashy demos.

Apr 8, 2026

167

Netflix VOID Fills the Gap in Post

Netflix just open sourced VOID, a video inpainting model built for a very real production problem: fixing footage that is almost usable. The conversation digs into how VOID removes people or objects and rebuilds motion, shadows, and scene logic so shots feel physically believable instead of obviously patched. That makes it a serious tool for editors, agencies, and brand teams trying to avoid reshoots, roto work, and cleanup chaos. The bigger takeaway goes beyond one model. AI video is shifting from flashy demo culture to workflow utility, where consistency, speed, and cost matter more than cinematic bragging rights. There is also a real trust question here, because better cleanup tools make provenance, disclosure, and human review a lot more important.

Apr 7, 2026

166

OpenAI Goes Full Operator While Veo 3 Joins the Workflow

OpenAI rumors are pointing toward bigger context windows, stronger computer use, and a real shift from chatbot novelty to agent execution. That matters less for benchmarks and more for actual work like research, planning, browser tasks, reporting, and handoffs. Google Veo 3 is also pushing video generation closer to business workflows through Google Vids, which could change how teams make explainers, ads, and internal content. On the audio side, ElevenLabs and Suno are making voice and music creation more practical for brands. Open source options like OpenClaw add another layer by giving teams more control, privacy, and flexibility. The real advantage is not model access. It's having strong systems, clear taste, and humans staying in the loop.

Apr 6, 2026

165

Gemini Talks, Shopify Sells, Runway Rolls Camera

Google’s Gemini 3.1 Flash Live pushes voice from demo mode toward real workflow use, with faster conversation, better turn taking, and stronger multimodal context. Shopify’s agentic storefront shift means product catalogs now need to work for AI shopping assistants, not just human buyers, making clean metadata and structured commerce a real advantage. Runway Gen 4.5 keeps moving AI video closer to production with better multi shot consistency, native audio momentum, and faster iteration for creative teams. The bigger theme is simple: voice, shopping, and video are all becoming operational systems. The winners will not be the teams chasing hype. They will be the ones pairing strong creative judgment with clean automation, better data, and human oversight where it counts most.

Apr 5, 2026

164

Microsoft MAI Models: Transcribe First, Then Scale

Microsoft just dropped three new Azure AI Foundry models, and the big takeaway is simple: transcription may be the real winner. MAI Transcribe 1, MAI Voice 1, and MAI Image 2 signal a bigger shift toward treating language, voice, and visuals as workflow infrastructure instead of novelty features. The conversation breaks down why transcription creates the fastest payoff for creators and marketers, where synthetic voice actually helps, and why better image text rendering matters for ads, mockups, and branded assets. It also looks at the tradeoffs between polished closed platforms and open source options like OpenClaw, with a practical case for hybrid stacks that keep humans focused on judgment while automation handles the repetitive middle.

Apr 4, 2026

163

Gemma 4 Goes Open While AI Marketers Get Weird

Google DeepMind just made Gemma 4 a real conversation by releasing it under Apache 2.0, and that changes how teams think about building with open models. This covers what Gemma 4 means for commercial use, local deployment, private workflows, and where smaller controllable models actually fit inside a production stack. It also gets into the Claude Code source exposure and why the bigger lesson is not drama but operational discipline. Then the focus shifts to the rise of the so called AI marketer, where agents can research, draft, monitor, and optimize but still cannot replace strategy, taste, or accountability. The real shift is from flashy copilots to systems that can carry work across steps.

Apr 3, 2026

162

Open Source Roars as AI Video Gets Pipeline Ready

AI video is getting close to real campaign use, but the bigger shift is what that does to creative workflows. Higgsfield Cinema Studio 3.0 pushes better character consistency and multi shot scenes, while Google Veo 3.1 Lite makes video generation cheaper, faster, and easier to plug into production systems. On the audio side, open source LongCat AudioDiT points to a future where voice becomes core infrastructure for localization, dubbing, and content scaling. The real advantage is not just prettier outputs. It is faster concepting, tighter feedback loops, stronger review systems, and human judgment baked into every step. Pretty clips are easy now. Useful, persuasive, on brand content still needs people in the loop.

Apr 2, 2026

161

Open Voice, Multi Shot, and Google’s AI Music Push

Google’s Lyria 3 Pro, Runway’s Multi Shot App, and Mistral’s open weights text to speech model all point to the same shift. Audio, video, and voice are becoming programmable workflow layers for creators and marketers. That opens the door to faster campaign concepts, localized narration, branded audio, and more efficient content production. It also raises bigger questions around taste, governance, approvals, and whether teams are making better work or just more of it. The real advantage is not having one more AI toy. It is building a stack that supports strategy, review, and brand consistency while keeping humans in the loop where judgment still matters most.

Apr 1, 2026

160

Open Qwen, Closed Loop: Multimodal Gets Real

Alibaba’s open Qwen 3.5 Omni is pushing multimodal AI past flashy demos and closer to real workflow value. Voice, camera input, long audio context, and fast generation are starting to look less like chatbot features and more like a new interface for building drafts, prototypes, and internal tools. The bigger question is where this actually works for teams with approvals, brand rules, and security needs. The conversation also maps the rise of practical AI video through Kling 3.0, Dreamina, and Seedance 2.0, plus why Intercom’s Fin Apex 1.0 may be the clearest sign of how enterprises will really buy AI. The takeaway is simple. Route the right work to the right model and keep humans on taste, trust, and decisions.

Mar 31, 2026

159

OpenClaw or Open Chaos? The Open Source Agent Reality

OpenClaw is getting hyped as an open source agent framework that can handle content, scheduling, asset creation, memory, and workflow coordination for lean teams. The real story is less about replacing your whole marketing team and more about building systems that can manage repeatable work without creating automated chaos. Persistent context, reusable templates, Telegram based coordination, and local model setups all sound powerful, but they still need strong briefs, clear approvals, and actual human judgment. The payoff is faster operations for monitoring, triage, summaries, and first drafts. The risk is scaling bland output or messy workflows. Automation can remove the boring middle, but brand voice, taste, and strategy still need humans in the loop.

Mar 30, 2026

158

Gemini Flash Live and the Great AI Workflow Reality Check

Google is pushing Gemini 3.1 Flash Live into real time voice and camera workflows, and that makes one thing clear. Voice AI is becoming a real interface layer for brands, not just a flashy demo. The bigger question is where it actually works. Customer triage, guided commerce, multilingual support, and structured actions look promising. Emotional nuance, messy edge cases, and brand risk still need people. The conversation also turns to Z.ai's GLM 5.1 and why lower cost models are putting real pressure on premium AI pricing. Add Snapchat and Google building generative tools deeper into ad platforms, and the shift is obvious. AI is moving from magic trick to workflow infrastructure, with humans still steering the ship.

Mar 29, 2026

157

Open Mic Night for AI: Covo, Cohere, and NotebookLM

Audio just stopped being a side feature and started looking like core workflow infrastructure. This conversation tracks three big signals behind that shift. Tencent’s open Covo Audio pushes toward more natural voice interaction with lower latency and better interruption handling. Cohere’s open speech recognition model could unlock cheaper, faster transcription for meetings, podcasts, support, and multilingual operations. NotebookLM is also stretching beyond research and into narrated video creation, collapsing steps that used to live across multiple tools. The real question is not which demo looks coolest. It is where automation actually removes friction while keeping humans close to judgment, brand, accuracy, and risk. That is where creators, marketers, and operators get real leverage.

Mar 28, 2026

156

Spud, Mythos, and the Rise of AI Campaign Operators

Leaked frontier model chatter is loud, but the bigger story is workflow. Anthropic's rumored Claude Mythos, also called Capybara in some corners, and OpenAI's rumored Spud show how fast the model race keeps shifting. The real takeaway for creators and marketers is not to rebuild around leaks or wait for the next flagship. It is to build model agnostic systems with evals, guardrails, and human review. Klaviyo Composer, Pomelli, and RogIQ show where things are heading as AI moves from chat assistant to campaign operator. That shift can remove the boring middle of briefs, asset versions, and reporting, but taste, judgment, and brand differentiation still need a human hand on the wheel.

Mar 27, 2026

155

Open Mic Night: Lyria, PrismAudio, and Mistral

Audio just jumped from nice to have to workflow priority. Google’s Lyria 3 Pro pushes AI music closer to usable campaign assets with longer, more structured tracks that fit real production needs. Open source PrismAudio tackles one of post production’s most annoying problems by matching sound effects and environmental audio to what is actually happening on screen. Mistral adds another important signal with open speech, giving teams more control over voice pipelines, localization, and costs. The bigger story is not just better demos. It is how brands, creators, and media teams build smarter audio systems that save time while keeping humans in charge of taste, trust, approvals, and final creative judgment.

Mar 26, 2026

154

Claude Clicks, Mistral Opens, and AI Gets to Work

Claude is moving from chatbot to operator, and that changes the automation conversation fast. This covers what Anthropic's computer use push really means for teams, where AI agents can save time today, and why brittle workflows still break the fantasy of full autonomy. It also digs into Microsoft's MAI Image 2 and why better text rendering matters more than flashy demos for marketers who need usable creative assets. Then it zooms out to Mistral's open weight momentum, why open models matter for control and multilingual workflows, and where the tradeoffs get very real. The through line is simple: machine action is getting better, but smart workflow design and human judgment still decide whether automation creates leverage or chaos.

Mar 25, 2026

153

Luma UNI1, Pika 2.2, and ElevenLabs Raise the Floor

Luma UNI1, Pika 2.2, and ElevenLabs all point to the same shift: polished creative output is getting faster, cheaper, and easier to slot into real workflows. Better image reasoning means fewer prompt gymnastics and more usable first drafts. Stronger short form video generation makes social production more reliable at the speed marketing teams actually need. ElevenLabs is pushing beyond voice tools into marketplace infrastructure, which puts creator payouts, licensing, and trust at the center of the audio conversation. The real story is not magic buttons. It is what happens when content generation gets easy and the bottleneck moves to taste, approvals, governance, and process. The winners will be teams with cleaner systems and better human judgment.

Mar 24, 2026

152

MiniMaxed Out: Open Weights, Agent Teams, and AI Ads

Open weights are getting real, agent teams are getting practical, and generative interfaces are starting to shape ad distribution. This conversation tracks why MiniMax M2.7 matters beyond benchmark hype, especially for teams that want more control over internal workflows without betting everything on one closed vendor. It also breaks down where multi agent coding systems like Codex can actually help and where they just create expensive digital meetings. The bigger shift is what happens when the same AI interface that helps create content also decides what gets surfaced to users. For creators, marketers, and media operators, the advantage comes from building structured workflows, keeping humans near judgment calls, and staying machine readable without becoming generic.

Mar 23, 2026

151

Open Source Gets Real with LTX 2.3, Rakuten, and Kitten

Open source AI had a very loud week. LTX 2.3 moved closer to real production use with an API for image to video and native portrait output, which matters a lot for social teams and ad workflows. Rakuten AI 3.0 added fuel to the regional model conversation with a large Japanese release that raises useful questions about localization, transparency, and what counts as real innovation. Kitten TTS showed how small voice models can bring text to speech to browsers, CPUs, and lower cost products. The bigger takeaway is simple. Better models are not the whole game. Workflow design, human review, and operational sanity are what turn open tools into something a team can actually use.

Mar 22, 2026

150

Open Source or Closed? AI Workflow Winners This Week

Google DeepMind is testing Deep Think in Gemini 2.5 Pro, Ollama 0.7 makes OpenClaw easier to run locally, and Typeface is pushing deeper into governed marketing orchestration. The bigger story is not which demo looks smartest. It is which setup actually helps teams ship better work with fewer surprises. This conversation breaks down where stronger reasoning helps, where local open source stacks save money but add ops overhead, and where commercial platforms earn their keep with approvals, compliance, and brand control. For creators, marketers, and media operators, the takeaway is simple: automate prep, QA, routing, and research, but keep humans close to final judgment, sensitive messaging, and anything that can create brand risk.

Mar 22, 2026

149

Open Nemotron and the Rise of Fast, Cheap AI Creation

NVIDIA’s open Nemotron models are pushing a big shift in AI workflows by making stronger reasoning and agent behavior more affordable. That matters for teams building automations for briefs, reporting, planning, and content operations without burning budget on every step. At the same time, Dreamverse is showing how ultra fast video generation changes creative work from prompt and wait to steer and refine. Rebel Audio adds another piece to the stack by lowering the friction for smaller teams that want useful audio without heavy production overhead. The bigger story is not unlimited automation. It is how open models, faster media tools, and lighter production pipelines raise the value of taste, review, governance, and human judgment across modern creator and marketing workflows.

Mar 21, 2026

148

Open Source Ears, Real Time Eyes

Runway is pushing AI video toward real time creation, with reported sub 100 millisecond response that could turn generation from a waiting game into a live creative tool. SkyReels V4 shows a different shift, where video models start looking more like usable software with benchmarks, pricing, multimodal inputs, and native audio. QuarkAudio adds the open source angle, pointing to a future where audio cleanup, separation, and voice tasks get less fragmented and more flexible. The bigger takeaway is not full autonomy. It is modular workflow design. Faster models move the bottleneck from rendering to judgment, approvals, brand safety, and taste. Human direction still matters most when automation makes endless iteration cheap.

Mar 20, 2026

147

Open, Fast, Loud: Runway, SkyReels, and QuarkAudio

Runway just showed real time video generation with sub 100 millisecond first frames, and that changes how creative teams iterate. Video starts to feel less like rendering and more like steering. SkyReels V4 pushes the conversation further with multimodal video, native audio, and production minded packaging that looks more like software than a research demo. Then Alibaba’s open source QuarkAudio brings a practical shift to audio workflows with one model aimed at cleanup, separation, and voice tasks. The real takeaway is not full autonomy. It is modular automation. Faster tools move the bottleneck from rendering to judgment, approvals, brand safety, and taste. The winners will be teams that pair speed with strong systems and humans who know what good looks like.

Mar 20, 2026

146

Are Agents Growing Up or Just Getting Louder

Agents are leveling up, but are they actually ready to own parts of your workflow yet? This episode breaks down MiniMax M2.7, Moonshot Kimi K2.5, and Langflow 1.8 with IBM Agentics through the lens of real marketing and creator use cases. Learn what self improving agents should actually handle, how to treat long context as a bigger desk, not a better brain, and why map reduce generate patterns matter. Get a clear do not automate list for brand risk, compliance, and taste while still using agents to crush prep work, tagging, research, and daily briefs.

Mar 20, 2026

145

Sub Agents and Safe Chaos in GPT 5.4 Mini, V8, and Covo Audio

GPT 5.4 Mini and Nano are shifting from hype to actual workflows, powering routing, content ops, and tightly scoped creative sub agents that package work without hitting publish. Midjourney V8 levels up speed and text-in-image but introduces “confident compliance” risks as almost-right visuals slip past tired reviewers. Tencent’s Covo Audio pushes open voice models toward real-time agents while raising serious questions about brand voice cloning, governance, and disclosure. Expect more value for creative leaders, brand guardians, and marketing systems builders while low-opinion first draft work gets automated away. The through line is human plus machine collaboration with strict guardrails and a ruthless taste filter.

Mar 19, 2026

144

Open Brains, Agent Teams, and the AI CMO Dream

Mistral Small 4 open weights, Okara’s AI CMO agent team, and Ollama’s Kimi K2.5 tool calling all point to the same shift. AI is moving from drafting content to actually operating workflows. This episode breaks down when to own the brain with open weights and when to rent it with hosted agents, plus what it really takes to self host without chaos. Learn where agent teams shine, where they fail on taste and truth, how to set reliability bars for tool calling, and which marketing workflows to automate first without torching your brand.

Mar 18, 2026

143

Google’s Multimodal Brain Meets Open Source Helios Video Chaos

Google quietly turned the internet into one multimodal brain with Gemini Embedding 2 and unified embeddings across text, images, video, audio, and PDFs. This episode breaks down how that changes creative search, cross-media intent matching, and ad workflows. It digs into hybrid retrieval, vibe control, and why metadata and risk tiers matter when Google Ads auto-edits your campaigns with new voiceovers and creative enhancements. It also covers Helios as an open source long form video model, what real time generation actually means for teams, and how creators can build modular pipelines that keep human taste in control while automation handles the grind.

Mar 17, 2026

142

The Nicer The Chart, The Bigger The Lie

Claude is moving into Excel and PowerPoint, Llama 4 rumors are heating up, and OpenClaw style browser agents are creeping toward your daily workflow. Hunter and Riley dig into how native AI in Office could automate marketing recaps, build decks from live data, and still mislead you with perfectly wrong charts. They unpack why open weights plus long context may finally make open source models practical for real marketing teams, and where context rot kicks in. Then they break down browser agents that can actually drive Chrome, the security tradeoffs, and how to safely use draft mode and audit trails to keep human control in the loop.

Mar 16, 2026

141

Open Weights and Infinite Clips: Phi 4, Stability, Helios

Microsoft’s Phi 4 Reasoning Vision model, Stability’s upgraded text to image, and Helios style real time video are colliding into a new kind of content assembly line. This episode breaks down where multimodal reasoning actually beats human throughput in ad and landing page compliance, where it still fails on nuance, and when to self host open weights versus lean on frontier APIs. Learn how brand teams can shift from rewriting assets to designing policy packs, prompt libraries, and critic layers. Get practical workflows for accessibility checks, asset tagging, and rapid video iteration so automation handles the grind while humans own taste, judgment, and guardrails.

Mar 15, 2026

140

AI Actors, Sora References, and Claude Charts Walk Into a Brand

Sora References, Soul Cast by Higgsfield, and Claude’s new interactive charts are all pushing AI toward repeatable, production-grade workflows. This episode breaks down how to build a reusable brand universe in Sora without turning everything into the same corporate sitcom, plus how to pressure-test character consistency and spot drift before you scale a campaign. Then it covers Higgsfield’s AI actors and “exclusive rights” and what that means for IP risk, localization, and archetype overload. Finally it unpacks Claude visualizations and content QA pipelines so marketers can automate reporting and checks without turning dashboards into pretty lies.

Mar 14, 2026

139

Open Source Shock: Nemotron, Llama 4 Scout, and Hume TADA

Nemotron 3 Super, Llama 4 Scout, and Hume TADA are all pushing what open source AI can do for real workflows. This episode digs into when million plus token context actually beats smart retrieval and when it just becomes expensive procrastination. Hear how to test long context models so they do not just summarize nonsense. Learn why open weights do not equal safe ops, plus the boring places where data still leaks. Then dive into TADA and what zero hallucinations really means for AI voice, strict copy lock, and brand safety. Get practical ideas for modular stacks that mix big context, fast tools, and specialist audio.

Mar 13, 2026

138

Open Source Avengers: GLM-5, MiroThinker, and Fish Audio S2

GLM-5 goes open weights with a privacy-first spin using Trusted Execution Environments, so sensitive marketing data can power real automation without living in random vendor APIs. MiroThinker pushes verification-centric agents that plan, execute, and then audit themselves to reduce silent multi-step failures in real workflows like landing page updates and research. Fish Audio S2 drops open expressive text to speech with emotional prompts and multi-speaker flows, which is huge for ads, localization, and character content. The conversation digs into tradeoffs, real failure modes, verification layers, and how to avoid automating your way into compliance nightmares while still building a small content studio in a box.

Mar 12, 2026

137

Open Source Audio Magic and Consistent AI Characters for Marketers

AI video and audio just picked up some serious new superpowers. Hear how Kling 3.0 makes character consistency real enough for UGC-style ad series without turning every cut into a glitchy horror film. Get the real use cases for HiAR long video and LTX 2.3 open weights plus the hidden costs of running models locally. Learn why Meta’s open source SAM Audio changes podcast cleanup, sonic branding, licensing, and ethics. Explore Ming Omni TTS for brand voice at scale and what interruption friendly, real time AI audio means for interactive ads. Everything is framed around workflows, human approval loops, and using modular media to ship faster.

Mar 11, 2026

136

GPT 5.4 Pro Vibes, Agent Chaos, and Open Source Tradeoffs

OpenAI’s latest frontier model drop, nicknamed GPT 5.4 Pro, brings million token context, faster inference, cheaper tool calls, better long context memory, and interruptible generation. The real story is what this means for automated workflows across research, copy, and campaigns. Hunter and Riley break down interruptible agents, safe computer use with permissions and logs, and when to trust agents with execution. They unpack Anthropic’s Claude Opus 4.6 security win with Mozilla, eval awareness drama, and why benchmarks are now mostly marketing. Finally, they dig into open source agent frameworks, what control and portability really buy you, and where these agents are ready versus still hilariously clumsy.

Mar 9, 2026

135

Excel Ghosts, Tiny Models, and Budget Video: Dentist Day AI Drilldown

Excel gets a brain and marketers suddenly have a translator between vibes and finance logic. This episode digs into how the new ChatGPT Excel add-in could reshape budget planning, incrementality testing, and data governance without turning spreadsheets into haunted houses. Then it unpacks where compact models like Phi 4 Reasoning Vision 15B actually win for creative QA and brand checks. Finally it breaks down Seedance 2.0 video pricing, how to use cheap AI video safely for concepting and testing, and why IP policies, queues, and clean asset libraries matter for any modern creative pipeline.

Mar 7, 2026

134

Helios, Higgsfield and Cuttlekit Open Source Chaos for Creators

Helios, Higgsfield and Cuttlekit stack into a spicy automation workflow for modern creators. Hunter and Riley unpack how Helios, an open source video model, could replace rough animatics with script to screen drafts while still needing serious guardrails on brand accuracy and compute. They dive into Higgsfield’s built in voice cloning and what real consent, governance and disclosure should look like when multilingual UGC ads become one click easy. Then they explore Cuttlekit’s generative UI that spins up ephemeral HTML tools on demand, plus why stable schemas and contract layers matter so it feels like a control room, not a haunted house.

Mar 6, 2026

133

Tiny Qwen, Big Plays and Agentic Marketing Mayhem

Qwen’s new compact open models, Higgsfield Audio, and Salesforce’s agentic marketing all point to one thing: workflows are getting weirdly powerful. This episode breaks down where tiny open Qwen models actually shine in marketing work, where they fall apart, and how to run them without turning into an infra team. Then it dives into Higgsfield Audio’s script to voice to translation and lip sync pipeline, what makes localized video feel uncanny, and how brands should handle rights and consent. Finally, it unpacks Salesforce’s new agentic marketing features, where autonomous optimization helps, where it flattens your brand into beige, and the guardrails teams need.

Mar 5, 2026

132

Fun-CosyVoice, Sonic Identity, and Agents in Hoodies

Audio is getting weird in a good way. This episode breaks down Alibaba Tongyi Lab’s open source Fun-CosyVoice 3.5 and Fun-AudioGen-VD, why natural language control for voice matters, and how to turn soundscapes into a real sonic identity instead of generic futuristic whooshes. Then the focus shifts to OpenAI’s Responses API and Agent SDK, what actually makes an agent different from workflow automation, and how to keep multi agent chaos in check. Finally, there is a look at Claude outages, memory upgrades, and how brand teams should think about dependency, fallbacks, curated memory, and the very real risks of beige audio and over trusted assistants.

Mar 4, 2026

131

Gemini 3, GPT 5.3, and Kling 3.0: Workflow or Hype Show

Google drops Gemini 3, OpenAI teases GPT 5.3 Codex and GPT 5 Mini, ByteDance levels up Seedream 5.0, and Kuaishou levels up Kling 3.0. This episode breaks down what actually changes for creative workflows. Learn how multimodal models can review your ads, why portability of your brand brain matters, and how to treat small models as bouncers, not headliners. Get practical patterns for using faster code models to automate glue work, set failure budgets for agents, and build AI video style guides that protect brand trust instead of feeding style gravity and deepfake paranoia.

Mar 3, 2026

