EPISODE · Jun 15, 2026 · 9 MIN
NEW GLM 5.2 DESTROYS Claude?
from AI News Today | Julian Goldie Podcast · host Julian Goldie
GLM 5.2 vs Kimi K 2.7 vs Opus 4.8: Which AI Model Builds the Best Apps?

The script compares GLM 5.2, Kimi K 2.7, and Opus 4.8 by giving them identical build prompts inside an agent operating system and judging the results across five tests: a Temple Run–style voxel runner (GLM 5.2 best), an inner solar system/orbit HUD simulation (Kimi K 2.7 best for zoom, speed, and customization), a liquid-in-a-bowl particle/metaball interaction (GLM 5.2 best), an Apple-style AI model landing page (GLM 5.2 best), and a neon arcade game (GLM 5.2 most fun). The narrator notes that GLM 5.2 is very new and not yet on OpenRouter, contrasts the models' origins and context windows, highlights that Kimi and GLM can be used inside AI agents unlike Claude/Opus, and concludes that GLM 5.2 wins four of the five tests, while promoting the AI Profit Boardroom community and the agent OS download.

00:00 Model Showdown Setup
00:57 Test 1 Temple Run Runner
02:01 Test 2 Solar Orbit Map
03:27 Test 3 Liquid Metaballs
04:39 Test 4 Apple Style Landing Page
05:40 Test 5 Neon Arcade Game
06:24 Benchmarks And Model Specs
07:51 How To Use Them Together
08:15 Get The Agent OS
09:30 Community Wrap Up

Fusion (OpenRouter) Lets You Combine Multiple Models to Reach Fable-Level Intelligence for Less

The script covers a new OpenRouter Fusion API update that runs a prompt across a parallel panel of up to eight models (with web search and bash tools), then uses a judge model to extract consensus, contradictions, unique insights, and missing coverage before returning one fused answer. Fusion is presented as a way to boost benchmark performance and reduce token costs versus relying on a single frontier model, with tests on 100 hard deep-research tasks showing much of the lift coming from synthesis rather than diversity. Examples compare solo models versus panels, including a “budget panel” of cheaper models landing within 1% of Claude Fable 5 on intelligence tests, and demonstrations of using Fusion in chat and via API to generate outputs like SEO research and a clean landing page.

00:00 Fusion Update Overview
00:51 Panels Beat Solo Models
01:43 Budget Panel Near Fable
02:23 How Fusion Works
03:05 Live Panel Demo
03:53 Benchmark Results Breakdown
05:02 API Integration Ideas
05:50 Boardroom SEO Example
06:56 Judge Fusion Output
08:05 Draco Benchmark Explained
09:10 Landing Page Results
10:28 Wrap Up And Offers
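
The panel-then-judge flow described in the Fusion notes above can be sketched roughly as follows. This is a minimal illustration, not OpenRouter's actual API: every name here (`Model`, `run_panel`, `judge`, `fuse`) is hypothetical, the panel members are plain Python callables standing in for real model calls, and the judge is a toy line-matching rule rather than a model. It only sorts claims into consensus and unique insights; contradictions and missing coverage are left out.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Set

# Hypothetical stand-in for a model call: takes a prompt, returns an answer.
Model = Callable[[str], str]


def run_panel(prompt: str, panel: Dict[str, Model]) -> Dict[str, str]:
    """Send the same prompt to every panel model in parallel."""
    if len(panel) > 8:
        # The show notes mention a panel of up to eight models.
        raise ValueError("panel is limited to eight models")
    with ThreadPoolExecutor(max_workers=max(1, len(panel))) as pool:
        futures = {name: pool.submit(model, prompt) for name, model in panel.items()}
        return {name: future.result() for name, future in futures.items()}


def judge(answers: Dict[str, str]) -> Dict[str, List[str]]:
    """Toy judge: treat each non-empty line as a claim and group claims by support."""
    supporters: Dict[str, Set[str]] = {}
    for name, text in answers.items():
        for line in text.splitlines():
            claim = line.strip().lower()
            if claim:
                supporters.setdefault(claim, set()).add(name)
    consensus = sorted(c for c, who in supporters.items() if len(who) == len(answers))
    unique = sorted(c for c, who in supporters.items() if len(who) == 1)
    return {"consensus": consensus, "unique": unique}


def fuse(prompt: str, panel: Dict[str, Model]) -> str:
    """Run the panel, judge the answers, and return one fused answer."""
    report = judge(run_panel(prompt, panel))
    return "\n".join(report["consensus"] + report["unique"])
```

With two stub models that agree on one line and differ on another, `fuse` returns the shared claim first, followed by each model's unique claim. In a real setup the judge step would itself be a model call, which is where the notes say most of the benchmark lift comes from.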