COEY Cast

PODCAST · business

COEY Cast is your daily download on AI and automation. We break down the latest in generative models, intelligent workflows, and emerging tools—giving marketers, operators, and business leaders the insights they need to move faster and scale smarter. From AI video and audio to end-to-end automation pipelines, each episode turns complex breakthroughs into clear, actionable takeaways you can actually use.

  1. 180

    Open Source Vibe Check with VibeVoice and MOSS Audio

    Microsoft's open source VibeVoice puts real pressure on audio workflows with multilingual transcription, speaker tracking, timestamps, and long context that can turn recordings into searchable assets. MOSS Audio adds a broader layer of audio understanding with emotion cues, music recognition, sound events, and time aware analysis that could help media teams mine podcasts, calls, ads, and live recordings for actual insight. Then Eva Brain enters with a bigger question for marketers: which parts of campaign management can agents really handle, and where do humans still need to lead? The bigger takeaway is simple. The model matters, but the workflow matters more when teams want automation that is useful, reliable, and still grounded in human judgment.

  2. 179

    Open Up: Nemotron, LLM jp 4, and Laguna

    Open models are having a real moment, and this trio shows why. NVIDIA Nemotron 3 Nano Omni points to simpler multimodal workflows by handling text, image, audio, and video in one stack. LLM jp 4 shows how regional open models can beat bigger global names when language, culture, and local context actually matter. Poolside Laguna brings the coding angle, but the bigger story is automation infrastructure for marketing teams that need custom tools, connectors, and internal workflows. The takeaway is practical: open can mean more control, flexibility, and lower lock in, but it also means more responsibility. Better systems win, especially when humans stay in the loop where judgment and brand risk matter most.

  3. 178

    OpenAI GPT 5.5 Ships Quietly, Workflows Loudly

    OpenAI dropped GPT 5.5 into the API with a huge context window, stronger reasoning, and deeper tool use, and the bigger story is how fast teams can put it to work. This covers why quiet launches matter more than flashy keynotes when marketers, creators, and operators need real workflow gains. It also digs into where automation actually helps first, from research briefs and call note synthesis to support flows with clean guardrails. ElevenLabs adds voice agent templates that make testing easier, while MiniMax Music 2.6 lowers the cost of experimenting with AI audio. The throughline is simple: AI is getting less performative and more operational, and the winners will be teams that ship practical systems with humans still making the calls.

  4. 177

    Audio Flamingo Next and the Rise of Specialist AI

    AI is getting less monolithic and more specialized, and that shift matters for anyone building real workflows. OpenAI’s GPT Rosalind signals that domain specific models are becoming a serious enterprise play. Higgsfield’s sci fi pilot shows AI video is pushing past flashy clips into longer form storytelling and faster pre production. NVIDIA’s open Audio Flamingo Next points to practical wins in podcast mining, searchable archives, call review, and media repurposing. The throughline is simple. General models still help orchestrate the stack, but specialist systems are where trust, depth, and format specific performance start to matter most. The real advantage comes from designing around recurring jobs, not chasing every shiny model release.

  5. 176

    Microsoft Foundry Gets Voice, Images, and Transcripts

    Microsoft just bundled MAI Transcribe 1, MAI Voice 1, and MAI Image 2 into Foundry, giving teams one place to handle transcription, synthetic voice, and image generation inside enterprise workflows. That sounds convenient, and it is, but it also raises the classic question of speed versus lock in. The conversation also digs into Audio Omni and why unified audio models could become real creative partners for editing, localization, sound design, and campaign iteration. Then it shifts to the less flashy but more important layer of AI adoption: rights, provenance, royalties, and governance. The real advantage is not stacking more models. It is building workflows that stay modular, accountable, and useful when real teams have to ship.

  6. 175

    Closed Doors and Voice Chords with Claude and Gemini

    Claude Opus 4.7, Gemini 3.1 Flash TTS, and GPT 5.4 Cyber all point to the same shift: AI tools are getting shaped around real jobs, not just flashy demos. This conversation breaks down what better instruction following, vision handling, task controls, and voice direction actually mean for creative ops, marketing teams, and automated content pipelines. The bigger takeaway is that stronger models do not fix weak workflows. Teams still need clear approvals, human judgment, and guardrails that keep fast production from turning into fast chaos. Open versus closed also stays in play as companies weigh convenience, privacy, portability, and control while building model stacks that fit how work actually gets done.

  7. 174

    Open Source, Open Questions: MiniMax, Orion1, LTX 2.3

    Open models are growing up fast, and that changes how teams should build with AI. MiniMax 2.7 puts fresh pressure on closed platforms with strong coding, long context, and agent potential, but the real story is workflow portability and avoiding vendor lock in. Orion1 pushes multilingual speech recognition forward with better support for lower resource languages, opening new possibilities for transcription, localization, and audience reach. OpenArt LTX 2.3 makes open video more practical for social content with better prompt adherence, smoother motion, audio sync, and portrait output. The bigger takeaway is simple. Date the model, marry the workflow, and keep humans in the loop where judgment still matters most.

  8. 173

    Meta’s Muse Spark and the Closed AI Workflow Play

    Meta’s Muse Spark is the clearest sign that AI is shifting from flashy demos into real workflow. The bigger move is not just the model itself, but the way Meta dropped it straight into the apps people already use all day. That makes Muse Spark immediately relevant for marketers and creators working across Facebook, Instagram, Messenger, WhatsApp, and more. The conversation also tracks why Google DeepMind’s Lyria 3 Pro matters for production ready audio, and why Wan 2.7 is getting attention for practical video controls like motion transfer and multi image guidance. The common thread is simple. AI is getting less theatrical and more useful for teams trying to ship content faster with humans still making the calls.

  9. 172

    OpenClaw and Open Models: Build Your Own AI Stack?

    Open source AI is moving from hobbyist flex to serious business infrastructure. DeepSeek's open models, OpenClaw, and new voice systems are pushing teams to ask a bigger question: should you own your AI stack instead of renting one through a closed API? This conversation breaks down what that shift actually means for marketers, creators, and operators. Private deployment, multilingual generation, branded voice workflows, and multi agent automation all sound great until governance, security, evals, and process design show up. The real opportunity is not just cheaper models. It's more control, better customization, and workflows built around your team. The real risk is automating chaos faster if humans are not steering the system.

  10. 171

    Seedance in the Fast Lane, Plus Happy Horse and Music 2.6

    ByteDance's Seedance 2.0 is making AI video feel ready for real production, not just flashy demos. The big shift is workflow. Teams can move from concept to scene direction, motion, and synced audio in one place with fewer handoffs and fewer weird artifacts. That changes how marketers, creators, and media teams think about speed, approvals, and risk. The conversation also looks at Alibaba's Happy Horse as a stealth video contender, why open versus closed systems still matters for automation strategy, and how MiniMax Music 2.6 is pushing audio AI toward editing, continuation, and usable post production. The takeaway is simple. Better models matter, but better workflows are what actually ship.

  11. 170

    Veo Goes Wide, ElevenLabs Locks Down, Firefly Cleans Up

    Google is opening Veo 3 to more teams and introducing Veo 3.1 Lite as a lower cost path to cinematic video generation. That matters less for pure wow factor and more for iteration, pre production, and testing creative directions before a full shoot. ElevenLabs is pushing voice AI deeper into enterprise with on prem and on device options that make privacy, compliance, and localization more realistic for serious buyers. Adobe Firefly rounds out the story with tools built for control, brand safety, and high volume asset production. The bigger theme is simple. AI wins when it fits the workflow. Better models help, but judgment, review, transparency, and usable handoffs still decide what actually ships.

  12. 169

    Open Source Goes Long with GLM 5.1

    Z.ai's open source GLM 5.1 is pushing the AI conversation past chat and straight into workflow automation. The model promises long horizon autonomy, tool use, structured outputs, and the kind of persistence that could make open models far more useful for creators, marketers, and operators. That does not mean teams should hand over the keys. The real opportunity is using strong open models for recoverable, high volume tasks while keeping humans at critical decision points. The conversation also digs into Anthropic's Claude Managed Agents and Google's Lyria 3 to show where agent infrastructure and AI audio are getting more practical, and where taste, oversight, and process design still matter most for getting real work done.

  13. 168

    Mythos, Meta, and CREATUS Walk Into a Workflow

    Anthropic’s Claude Mythos Preview arrives as a locked down frontier model, and the restrictions are almost as important as the benchmarks. That raises a bigger question about trust, safety, and whether teams should bet on closed systems or wait for open alternatives. Meta keeps pushing ad automation with Andromeda and Advantage+, shifting media buying away from manual controls and toward creative volume, structured variation, and stronger human judgment. CREATUS.AI brings a more practical video update with better motion, lip sync, longer clips, and audio driven generation that could actually speed up production. The throughline is simple: AI is getting more useful when it fits real workflows, not just flashy demos.

  14. 167

    Netflix VOID Fills the Gap in Post

    Netflix just open sourced VOID, a video inpainting model built for a very real production problem: fixing footage that is almost usable. The conversation digs into how VOID removes people or objects and rebuilds motion, shadows, and scene logic so shots feel physically believable instead of obviously patched. That makes it a serious tool for editors, agencies, and brand teams trying to avoid reshoots, roto work, and cleanup chaos. The bigger takeaway goes beyond one model. AI video is shifting from flashy demo culture to workflow utility, where consistency, speed, and cost matter more than cinematic bragging rights. There is also a real trust question here, because better cleanup tools make provenance, disclosure, and human review a lot more important.

  15. 166

    OpenAI Goes Full Operator While Veo 3 Joins the Workflow

    OpenAI rumors are pointing toward bigger context windows, stronger computer use, and a real shift from chatbot novelty to agent execution. That matters less for benchmarks and more for actual work like research, planning, browser tasks, reporting, and handoffs. Google Veo 3 is also pushing video generation closer to business workflows through Google Vids, which could change how teams make explainers, ads, and internal content. On the audio side, ElevenLabs and Suno are making voice and music creation more practical for brands. Open source options like OpenClaw add another layer by giving teams more control, privacy, and flexibility. The real advantage is not model access. It's having strong systems, clear taste, and humans staying in the loop.

  16. 165

    Gemini Talks, Shopify Sells, Runway Rolls Camera

    Google’s Gemini 3.1 Flash Live pushes voice from demo mode toward real workflow use, with faster conversation, better turn taking, and stronger multimodal context. Shopify’s agentic storefront shift means product catalogs now need to work for AI shopping assistants, not just human buyers, making clean metadata and structured commerce a real advantage. Runway Gen 4.5 keeps moving AI video closer to production with better multi shot consistency, native audio momentum, and faster iteration for creative teams. The bigger theme is simple: voice, shopping, and video are all becoming operational systems. The winners will not be the teams chasing hype. They will be the ones pairing strong creative judgment with clean automation, better data, and human oversight where it counts most.

  17. 164

    Microsoft MAI Models: Transcribe First, Then Scale

    Microsoft just dropped three new Azure AI Foundry models, and the big takeaway is simple: transcription may be the real winner. MAI Transcribe 1, MAI Voice 1, and MAI Image 2 signal a bigger shift toward treating language, voice, and visuals as workflow infrastructure instead of novelty features. The conversation breaks down why transcription creates the fastest payoff for creators and marketers, where synthetic voice actually helps, and why better image text rendering matters for ads, mockups, and branded assets. It also looks at the tradeoffs between polished closed platforms and open source options like OpenClaw, with a practical case for hybrid stacks that keep humans focused on judgment while automation handles the repetitive middle.

  18. 163

    Gemma 4 Goes Open While AI Marketers Get Weird

    Google DeepMind just made Gemma 4 a real conversation by releasing it under Apache 2.0, and that changes how teams think about building with open models. This covers what Gemma 4 means for commercial use, local deployment, private workflows, and where smaller controllable models actually fit inside a production stack. It also gets into the Claude Code source exposure and why the bigger lesson is not drama but operational discipline. Then the focus shifts to the rise of the so called AI marketer, where agents can research, draft, monitor, and optimize but still cannot replace strategy, taste, or accountability. The real shift is from flashy copilots to systems that can carry work across steps.

  19. 162

    Open Source Roars as AI Video Gets Pipeline Ready

    AI video is getting close to real campaign use, but the bigger shift is what that does to creative workflows. Higgsfield Cinema Studio 3.0 pushes better character consistency and multi shot scenes, while Google Veo 3.1 Lite makes video generation cheaper, faster, and easier to plug into production systems. On the audio side, open source LongCat AudioDiT points to a future where voice becomes core infrastructure for localization, dubbing, and content scaling. The real advantage is not just prettier outputs. It is faster concepting, tighter feedback loops, stronger review systems, and human judgment baked into every step. Pretty clips are easy now. Useful, persuasive, on brand content still needs people in the loop.

  20. 161

    Open Voice, Multi Shot, and Google’s AI Music Push

    Google’s Lyria 3 Pro, Runway’s Multi Shot App, and Mistral’s open weights text to speech model all point to the same shift. Audio, video, and voice are becoming programmable workflow layers for creators and marketers. That opens the door to faster campaign concepts, localized narration, branded audio, and more efficient content production. It also raises bigger questions around taste, governance, approvals, and whether teams are making better work or just more of it. The real advantage is not having one more AI toy. It is building a stack that supports strategy, review, and brand consistency while keeping humans in the loop where judgment still matters most.

  21. 160

    Open Qwen, Closed Loop: Multimodal Gets Real

    Alibaba’s open Qwen 3.5 Omni is pushing multimodal AI past flashy demos and closer to real workflow value. Voice, camera input, long audio context, and fast generation are starting to look less like chatbot features and more like a new interface for building drafts, prototypes, and internal tools. The bigger question is where this actually works for teams with approvals, brand rules, and security needs. The conversation also maps the rise of practical AI video through Kling 3.0, Dreamina, and Seedance 2.0, plus why Intercom’s Fin Apex 1.0 may be the clearest sign of how enterprises will really buy AI. The takeaway is simple. Route the right work to the right model and keep humans on taste, trust, and decisions.

  22. 159

    OpenClaw or Open Chaos? The Open Source Agent Reality

    OpenClaw is getting hyped as an open source agent framework that can handle content, scheduling, asset creation, memory, and workflow coordination for lean teams. The real story is less about replacing your whole marketing team and more about building systems that can manage repeatable work without creating automated chaos. Persistent context, reusable templates, Telegram based coordination, and local model setups all sound powerful, but they still need strong briefs, clear approvals, and actual human judgment. The payoff is faster operations for monitoring, triage, summaries, and first drafts. The risk is scaling bland output or messy workflows. Automation can remove the boring middle, but brand voice, taste, and strategy still need humans in the loop.

  23. 158

    Gemini Flash Live and the Great AI Workflow Reality Check

    Google is pushing Gemini 3.1 Flash Live into real time voice and camera workflows, and that makes one thing clear. Voice AI is becoming a real interface layer for brands, not just a flashy demo. The bigger question is where it actually works. Customer triage, guided commerce, multilingual support, and structured actions look promising. Emotional nuance, messy edge cases, and brand risk still need people. The conversation also turns to Z.ai's GLM 5.1 and why lower cost models are putting real pressure on premium AI pricing. Add Snapchat and Google building generative tools deeper into ad platforms, and the shift is obvious. AI is moving from magic trick to workflow infrastructure, with humans still steering the ship.

  24. 157

    Open Mic Night for AI: Covo, Cohere, and NotebookLM

    Audio just stopped being a side feature and started looking like core workflow infrastructure. This conversation tracks three big signals behind that shift. Tencent’s open Covo Audio pushes toward more natural voice interaction with lower latency and better interruption handling. Cohere’s open speech recognition model could unlock cheaper, faster transcription for meetings, podcasts, support, and multilingual operations. NotebookLM is also stretching beyond research and into narrated video creation, collapsing steps that used to live across multiple tools. The real question is not which demo looks coolest. It is where automation actually removes friction while keeping humans close to judgment, brand, accuracy, and risk. That is where creators, marketers, and operators get real leverage.

  25. 156

    Spud, Mythos, and the Rise of AI Campaign Operators

    Leaked frontier model chatter is loud, but the bigger story is workflow. Anthropic's rumored Claude Mythos, also called Capybara in some corners, and OpenAI's rumored Spud show how fast the model race keeps shifting. The real takeaway for creators and marketers is not to rebuild around leaks or wait for the next flagship. It is to build model agnostic systems with evals, guardrails, and human review. Klaviyo Composer, Pomelli, and RogIQ show where things are heading as AI moves from chat assistant to campaign operator. That shift can remove the boring middle of briefs, asset versions, and reporting, but taste, judgment, and brand differentiation still need a human hand on the wheel.

  26. 155

    Open Mic Night: Lyria, PrismAudio, and Mistral

    Audio just jumped from nice to have to workflow priority. Google’s Lyria 3 Pro pushes AI music closer to usable campaign assets with longer, more structured tracks that fit real production needs. Open source PrismAudio tackles one of post production’s most annoying problems by matching sound effects and environmental audio to what is actually happening on screen. Mistral adds another important signal with open speech, giving teams more control over voice pipelines, localization, and costs. The bigger story is not just better demos. It is how brands, creators, and media teams build smarter audio systems that save time while keeping humans in charge of taste, trust, approvals, and final creative judgment.

  27. 154

    Claude Clicks, Mistral Opens, and AI Gets to Work

    Claude is moving from chatbot to operator, and that changes the automation conversation fast. This covers what Anthropic's computer use push really means for teams, where AI agents can save time today, and why brittle workflows still break the fantasy of full autonomy. It also digs into Microsoft's MAI Image 2 and why better text rendering matters more than flashy demos for marketers who need usable creative assets. Then it zooms out to Mistral's open weight momentum, why open models matter for control and multilingual workflows, and where the tradeoffs get very real. The through line is simple: machine action is getting better, but smart workflow design and human judgment still decide whether automation creates leverage or chaos.

  28. 153

    Luma UNI1, Pika 2.2, and ElevenLabs Raise the Floor

    Luma UNI1, Pika 2.2, and ElevenLabs all point to the same shift: polished creative output is getting faster, cheaper, and easier to slot into real workflows. Better image reasoning means fewer prompt gymnastics and more usable first drafts. Stronger short form video generation makes social production more reliable at the speed marketing teams actually need. ElevenLabs is pushing beyond voice tools into marketplace infrastructure, which puts creator payouts, licensing, and trust at the center of the audio conversation. The real story is not magic buttons. It is what happens when content generation gets easy and the bottleneck moves to taste, approvals, governance, and process. The winners will be teams with cleaner systems and better human judgment.

  29. 152

    MiniMaxed Out: Open Weights, Agent Teams, and AI Ads

    Open weights are getting real, agent teams are getting practical, and generative interfaces are starting to shape ad distribution. This conversation tracks why MiniMax M2.7 matters beyond benchmark hype, especially for teams that want more control over internal workflows without betting everything on one closed vendor. It also breaks down where multi agent coding systems like Codex can actually help and where they just create expensive digital meetings. The bigger shift is what happens when the same AI interface that helps create content also decides what gets surfaced to users. For creators, marketers, and media operators, the advantage comes from building structured workflows, keeping humans near judgment calls, and staying machine readable without becoming generic.

  30. 151

    Open Source Gets Real with LTX 2.3, Rakuten, and Kitten

    Open source AI had a very loud week. LTX 2.3 moved closer to real production use with an API for image to video and native portrait output, which matters a lot for social teams and ad workflows. Rakuten AI 3.0 added fuel to the regional model conversation with a large Japanese release that raises useful questions about localization, transparency, and what counts as real innovation. Kitten TTS showed how small voice models can bring text to speech to browsers, CPUs, and lower cost products. The bigger takeaway is simple. Better models are not the whole game. Workflow design, human review, and operational sanity are what turn open tools into something a team can actually use.

  31. 150

    Open Source or Closed? AI Workflow Winners This Week

    Google DeepMind is testing Deep Think in Gemini 2.5 Pro, Ollama 0.7 makes OpenClaw easier to run locally, and Typeface is pushing deeper into governed marketing orchestration. The bigger story is not which demo looks smartest. It is which setup actually helps teams ship better work with fewer surprises. This conversation breaks down where stronger reasoning helps, where local open source stacks save money but add ops overhead, and where commercial platforms earn their keep with approvals, compliance, and brand control. For creators, marketers, and media operators, the takeaway is simple: automate prep, QA, routing, and research, but keep humans close to final judgment, sensitive messaging, and anything that can create brand risk.

  32. 149

    Open Nemotron and the Rise of Fast, Cheap AI Creation

    NVIDIA’s open Nemotron models are pushing a big shift in AI workflows by making stronger reasoning and agent behavior more affordable. That matters for teams building automations for briefs, reporting, planning, and content operations without burning budget on every step. At the same time, Dreamverse is showing how ultra fast video generation changes creative work from prompt and wait to steer and refine. Rebel Audio adds another piece to the stack by lowering the friction for smaller teams that want useful audio without heavy production overhead. The bigger story is not unlimited automation. It is how open models, faster media tools, and lighter production pipelines raise the value of taste, review, governance, and human judgment across modern creator and marketing workflows.

  33. 148

    Open Source Ears, Real Time Eyes

    Runway is pushing AI video toward real time creation, with reported sub 100 millisecond response that could turn generation from a waiting game into a live creative tool. SkyReels V4 shows a different shift, where video models start looking more like usable software with benchmarks, pricing, multimodal inputs, and native audio. QuarkAudio adds the open source angle, pointing to a future where audio cleanup, separation, and voice tasks get less fragmented and more flexible. The bigger takeaway is not full autonomy. It is modular workflow design. Faster models move the bottleneck from rendering to judgment, approvals, brand safety, and taste. Human direction still matters most when automation makes endless iteration cheap.

  34. 147

    Open, Fast, Loud: Runway, SkyReels, and QuarkAudio

    Runway just showed real time video generation with sub 100 millisecond first frames, and that changes how creative teams iterate. Video starts to feel less like rendering and more like steering. SkyReels V4 pushes the conversation further with multimodal video, native audio, and production minded packaging that looks more like software than a research demo. Then Alibaba’s open source QuarkAudio brings a practical shift to audio workflows with one model aimed at cleanup, separation, and voice tasks. The real takeaway is not full autonomy. It is modular automation. Faster tools move the bottleneck from rendering to judgment, approvals, brand safety, and taste. The winners will be teams that pair speed with strong systems and humans who know what good looks like.

  35. 146

    Are Agents Growing Up or Just Getting Louder?

    Agents are leveling up, but are they actually ready to own parts of your workflow yet? This episode breaks down MiniMax M2.7, Moonshot Kimi K2.5, and Langflow 1.8 with IBM Agentics through the lens of real marketing and creator use cases. Learn what self improving agents should actually handle, how to treat long context as a bigger desk, not a better brain, and why map reduce generate patterns matter. Get a clear do not automate list for brand risk, compliance, and taste while still using agents to crush prep work, tagging, research, and daily briefs.

  36. 145

    Sub Agents and Safe Chaos in GPT 5.4 Mini, V8, and Covo Audio

    GPT 5.4 Mini and Nano are shifting from hype to actual workflows, powering routing, content ops, and tightly scoped creative sub agents that package work without hitting publish. Midjourney V8 levels up speed and text-in-image but introduces “confident compliance” risks as almost-right visuals slip past tired reviewers. Tencent’s Covo Audio pushes open voice models toward real-time agents while raising serious questions about brand voice cloning, governance, and disclosure. Expect more value for creative leaders, brand guardians, and marketing systems builders while low-opinion first draft work gets automated away. The through line is human plus machine collaboration with strict guardrails and a ruthless taste filter.

  37. 144

    Open Brains, Agent Teams, and the AI CMO Dream

    Mistral Small 4 open weights, Okara’s AI CMO agent team, and Ollama’s Kimi K2.5 tool calling all point to the same shift. AI is moving from drafting content to actually operating workflows. This episode breaks down when to own the brain with open weights and when to rent it with hosted agents, plus what it really takes to self host without chaos. Learn where agent teams shine, where they fail on taste and truth, how to set reliability bars for tool calling, and which marketing workflows to automate first without torching your brand.

  38. 143

    Google’s Multimodal Brain Meets Open Source Helios Video Chaos

    Google quietly turned the internet into one multimodal brain with Gemini Embedding 2 and unified embeddings across text, images, video, audio, and PDFs. This episode breaks down how that changes creative search, cross-media intent matching, and ad workflows. It digs into hybrid retrieval, vibe control, and why metadata and risk tiers matter when Google Ads auto-edits your campaigns with new voiceovers and creative enhancements. It also covers Helios as an open source long form video model, what real time generation actually means for teams, and how creators can build modular pipelines that keep human taste in control while automation handles the grind.

  39. 142

    The Nicer The Chart, The Bigger The Lie

    Claude is moving into Excel and PowerPoint, Llama 4 rumors are heating up, and OpenClaw style browser agents are creeping toward your daily workflow. Hunter and Riley dig into how native AI in Office could automate marketing recaps, build decks from live data, and still mislead you with perfectly wrong charts. They unpack why open weights plus long context may finally make open source models practical for real marketing teams, and where context rot kicks in. Then they break down browser agents that can actually drive Chrome, the security tradeoffs, and how to safely use draft mode and audit trails to keep human control in the loop.

  40. 141

    Open Weights and Infinite Clips: Phi 4, Stability, Helios

    Microsoft’s Phi 4 Reasoning Vision model, Stability’s upgraded text to image, and Helios style real time video are colliding into a new kind of content assembly line. This episode breaks down where multimodal reasoning actually beats human throughput in ad and landing page compliance, where it still fails on nuance, and when to self host open weights versus lean on frontier APIs. Learn how brand teams can shift from rewriting assets to designing policy packs, prompt libraries, and critic layers. Get practical workflows for accessibility checks, asset tagging, and rapid video iteration so automation handles the grind while humans own taste, judgment, and guardrails.

  41. 140

    AI Actors, Sora References, and Claude Charts Walk Into a Brand

    Sora References, Soul Cast by Higgsfield, and Claude’s new interactive charts are all pushing AI toward repeatable, production-grade workflows. This episode breaks down how to build a reusable brand universe in Sora without turning everything into the same corporate sitcom, plus how to pressure-test character consistency and spot drift before you scale a campaign. Then it covers Higgsfield’s AI actors and “exclusive rights” and what that means for IP risk, localization, and archetype overload. Finally it unpacks Claude visualizations and content QA pipelines so marketers can automate reporting and checks without turning dashboards into pretty lies.

  42. 139

    Open Source Shock: Nemotron, Llama 4 Scout, and Hume TADA

    Nemotron 3 Super, Llama 4 Scout, and Hume TADA are all pushing what open source AI can do for real workflows. This episode digs into when million plus token context actually beats smart retrieval and when it just becomes expensive procrastination. Hear how to test long context models so they do not just summarize nonsense. Learn why open weights do not equal safe ops plus the boring places data still leaks. Then dive into TADA and what zero hallucinations really means for AI voice, strict copy lock, and brand safety. Get practical ideas for modular stacks that mix big context, fast tools, and specialist audio.

  43. 138

    Open Source Avengers: GLM-5, MiroThinker, and Fish Audio S2

    GLM-5 goes open weights with a privacy-first spin using Trusted Execution Environments and suddenly sensitive marketing data can power real automation without living in random vendor APIs. MiroThinker pushes verification-centric agents that plan, execute, and then audit themselves to reduce silent multi-step failures in real workflows like landing page updates and research. Fish Audio S2 drops open expressive text to speech with emotional prompts and multi-speaker flows, which is huge for ads, localization, and character content. The conversation digs into tradeoffs, real failure modes, verification layers, and how to avoid automating your way into compliance nightmares while still building a small content studio in a box.

  44. 137

    Open Source Audio Magic and Consistent AI Characters for Marketers

    AI video and audio just picked up some serious new superpowers. Hear how Kling 3.0 makes character consistency real enough for UGC-style ad series without turning every cut into a glitchy horror film. Get the real use cases for HiAR long video and LTX 2.3 open weights plus the hidden costs of running models locally. Learn why Meta’s open source SAM Audio changes podcast cleanup, sonic branding, licensing, and ethics. Explore Ming Omni TTS for brand voice at scale and what interruption friendly, real time AI audio means for interactive ads. Everything is framed around workflows, human approval loops, and using modular media to ship faster.

  45. 136

    GPT 5.4 Pro Vibes, Agent Chaos, and Open Source Tradeoffs

    OpenAI’s latest frontier model drop, nicknamed GPT 5.4 Pro, brings million token context, faster inference, cheaper tool calls, better long context memory, and interruptible generation. The real story is what this means for automated workflows across research, copy, and campaigns. Hunter and Riley break down interruptible agents, safe computer use with permissions and logs, and when to trust agents with execution. They unpack Anthropic’s Claude Opus 4.6 security win with Mozilla, eval awareness drama, and why benchmarks are now mostly marketing. Finally, they dig into open source agent frameworks, what control and portability really buy you, and where these agents are ready versus still hilariously clumsy.

  46. 135

    Excel Ghosts, Tiny Models, and Budget Video: Dentist Day AI Drilldown

    Excel gets a brain and marketers suddenly have a translator between vibes and finance logic. This episode digs into how the new ChatGPT Excel add-in could reshape budget planning, incrementality testing, and data governance without turning spreadsheets into haunted houses. Then it unpacks where compact models like Phi 4 Reasoning Vision 15B actually win for creative QA and brand checks. Finally, it breaks down Seedance 2.0 video pricing, how to use cheap AI video safely for concepting and testing, and why IP policies, queues, and clean asset libraries matter for any modern creative pipeline.

  47. 134

    Helios, Higgsfield and Cuttlekit Open Source Chaos for Creators

    Helios, Higgsfield and Cuttlekit stack into a spicy automation workflow for modern creators. Hunter and Riley unpack how Helios, an open source video model, could replace rough animatics with script-to-screen drafts while still needing serious guardrails on brand accuracy and compute. They dive into Higgsfield’s built-in voice cloning and what real consent, governance and disclosure should look like when multilingual UGC ads become one-click easy. Then they explore Cuttlekit’s generative UI that spins up ephemeral HTML tools on demand, plus why stable schemas and contract layers matter so it feels like a control room, not a haunted house.

  48. 133

    Tiny Qwen, Big Plays and Agentic Marketing Mayhem

    Qwen’s new compact open models, Higgsfield Audio, and Salesforce’s agentic marketing all point to one thing: workflows are getting weirdly powerful. This episode breaks down where tiny open Qwen models actually shine in marketing work, where they fall apart, and how to run them without turning into an infra team. Then it dives into Higgsfield Audio’s script to voice to translation and lip sync pipeline, what makes localized video feel uncanny, and how brands should handle rights and consent. Finally, it unpacks Salesforce’s new agentic marketing features, where autonomous optimization helps, where it flattens your brand into beige, and the guardrails teams need.

  49. 132

    Fun-CosyVoice, Sonic Identity, and Agents in Hoodies

    Audio is getting weird in a good way. This episode breaks down Alibaba Tongyi Lab’s open source Fun-CosyVoice 3.5 and Fun-AudioGen-VD, why natural language control for voice matters, and how to turn soundscapes into a real sonic identity instead of generic futuristic whooshes. Then the focus shifts to OpenAI’s Responses API and Agent SDK, what actually makes an agent different from workflow automation, and how to keep multi agent chaos in check. Finally, there is a look at Claude outages, memory upgrades, and how brand teams should think about dependency, fallbacks, curated memory, and the very real risks of beige audio and over trusted assistants.

  50. 131

    Gemini 3, GPT 5.3, and Kling 3.0: Workflow or Hype Show

    Google drops Gemini 3, OpenAI teases GPT 5.3 Codex and GPT 5 Mini, and ByteDance levels up Seedream 5.0 and Kling 3.0. This episode breaks down what actually changes for creative workflows. Learn how multimodal models can review your ads, why portability of your brand brain matters, and how to treat small models as bouncers not headliners. Get practical patterns for using faster code models to automate glue work, set failure budgets for agents, and build AI video style guides that protect brand trust instead of feeding style gravity and deepfake paranoia.

HOSTED BY

COEY
