EPISODE · Jun 24, 2026 · 11 MIN
Fugu: NEW Japanese AI DESTROYS Fable 5?
from AI News Today | Julian Goldie Podcast · host Julian Goldie
Fugu Ultra vs Fable 5 & Fusion: New Multi-Agent Panel Model Benchmarks (GoldyBench)

The video reviews Sakana.ai’s newly released Fugu Ultra, a multi-agent panel API that runs prompts across multiple models in parallel and fuses results via a judge, aiming for Fable 5-level outputs comparable to Fable and Mythos. The presenter shows one-shot examples (websites, animated galaxy, inner solar system), explains integration into their Agent OS alongside Fusion, and compares speed and reliability: Fugu is faster for latency while Fugu Ultra is much slower but optimized for benchmark performance. Benchmarks shown indicate Sakana outperforming Fable on several tests (e.g., Terminal Bench 2.1, GPQA Diamond, Live Code Bench), and it beat Opus 4.8 on most creations except a voxel game that failed due to 16K token truncation after a long wait. They also note strict API limits that can block usage for hours, regional access issues, and recommend using multiple models in an agentic OS for parallel work and fallbacks, with Fusion remaining top for their outputs.

00:00 Fugu Ultra Arrives
00:32 One Shot Demos
01:19 How Multi Agent Fusion Works
02:13 Agent OS Integration
03:16 Benchmarks And Strengths
03:57 Tradeoffs Limits And Truncation
05:18 Leaderboard And Fusion Examples
05:59 Speed Vs Smooth Workflow
07:22 Pricing Access And Availability
09:16 Best Setup And Wrap Up
10:21 Community And Final Outro
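For readers who want a concrete picture of the "panel plus judge" idea described in the episode, here is a minimal Python sketch. It is not Sakana.ai's API or the presenter's Agent OS: the model functions, the `run_panel` helper, and the length-based judge are all hypothetical stand-ins, assumed only for illustration. It shows one prompt fanned out to several models in parallel, failed members skipped as a simple fallback, and a judge choosing among the surviving answers.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple

# Hypothetical types: a "model" is any function from prompt to text.
Model = Callable[[str], str]
Candidate = Tuple[str, str]  # (model name, answer text)


def run_panel(prompt: str,
              models: Dict[str, Model],
              judge: Callable[[str, List[Candidate]], Candidate]) -> Candidate:
    """Send one prompt to every model in parallel, then let a judge pick."""

    def call(item: Tuple[str, Model]) -> Tuple[str, object]:
        name, model = item
        try:
            return name, model(prompt)
        except Exception:
            # A failed or rate-limited member is dropped, not fatal.
            return name, None

    with ThreadPoolExecutor(max_workers=len(models)) as pool:
        results = list(pool.map(call, models.items()))

    candidates = [(name, text) for name, text in results if text is not None]
    if not candidates:
        raise RuntimeError("every panel member failed")
    return judge(prompt, candidates)


# Stand-in models (assumptions, not real endpoints).
def model_a(prompt: str) -> str:
    return f"Short answer to: {prompt}"


def model_b(prompt: str) -> str:
    return f"Longer, more detailed answer to: {prompt}, with extra reasoning."


def model_c(prompt: str) -> str:
    raise TimeoutError("simulated API limit")


def longest_answer_judge(prompt: str, candidates: List[Candidate]) -> Candidate:
    # Toy judge: prefers the longest answer. A real judge would be another model.
    return max(candidates, key=lambda c: len(c[1]))


panel = {"model_a": model_a, "model_b": model_b, "model_c": model_c}
winner_name, winner_text = run_panel("Build a landing page", panel, longest_answer_judge)
```

In this sketch the whole panel waits for its slowest member before judging, which is consistent with the episode's point that a panel model trades latency for output quality.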