EPISODE · May 29, 2026 · 12 MIN
Claude Opus 4.8 Just Changed AI Agents FOREVER!
from AI News Today | Julian Goldie Podcast · host Julian Goldie
Claude Opus 4.8: Dynamic Workflows, Hundreds of Agents, and the Benchmarks That Matter

The script explains Claude Opus 4.8’s release, highlighting “dynamic workflows” that let Claude spin up hundreds of coordinated agents in one session to plan, build, review, and self-check work, with agents able to run for days. It describes enabling this via Claude Code’s Ultra Code mode (or by asking Claude to create a dynamic workflow), and promotes an “agent operating system” inside the AI Profit Bot/AI Profit Boardroom with a zip file, tutorial, 30-day roadmap, prompts, weekly coaching calls, and a community.

The video reviews benchmarks: SWE-bench Verified at 88.6% and SWE-bench Pro at 69.2% (both ahead of GPT-5.5 and Gemini 3.1 Pro), Terminal-Bench at 74.6% (behind GPT-5.5), a #1 rank on Frontier SWE, the office-work benchmark GDPval, where 4.8 beats GPT-5.5 in most matchups, an improved Zapier workflow score, and a major jump in long-context memory. It also notes reduced overclaiming, less “sneaky” behavior, effort controls, a faster and cheaper fast mode, and that Anthropic’s teased Mythos model may arrive soon.

00:00 Opus 4.8 Biggest Upgrade
00:52 Dynamic Workflows Explained
01:25 Ultra Code One Click Setup
01:51 Parallel Agent Speed Gains
02:33 Bun Rebuild Case Study
03:10 Agents Running For Days
03:33 Agent OS And Offer
04:26 Coding Benchmarks Breakdown
06:08 Office Legal Workflow Tests
07:17 Long Context Memory Leap
08:22 Honesty And Safety Improvements
09:15 Mythos Preview And Timeline
10:02 Effort Controls Pricing Speed
10:34 Wrap Up And Next Steps