All Episodes
Learning GenAI via SOTA Papers — 183 episodes
EP183: AI coding agents cheat with keywords
EP182: AI logic is its weakest link
EP181: Small models beating GPT-5 with logic
EP180: How AI agents rewrite their code
EP179: AIBuildAI Builds New AI Models From Scratch
EP178: AI agents reaching silent latent consensus
EP177: CAPO math stops overconfident AI lies
EP176: Trigonometry fixes the AI memory bottleneck
EP175: How AI models teach themselves reasoning
EP174: 1-bit Bonsai brings powerful AI offline
EP173: AI models diagnosing diseases from blank scans
EP172: How HyperAgents rewrite their own code
EP171: Helium makes AI agent workflows 40x faster
EP170: Qwen3.5 Multimodal Agent
EP169: Cybersecurity Risks of Autonomous AI Agents
EP168: Turning AI Agents into Mathematical Functions
EP167: Why AI models ignore visual evidence
EP166: The Auton solution to the integration paradox
EP165: Translating hidden AI logic into English
EP164: [LACONIC] Teaching AI to stop overthinking
EP163: Why AI Models Only Remember Five Percent
EP162: AI agents beat humans with malicious skills
EP161: Small AI Judges Beat Massive Coding Giants
EP160: [AgentSys] Securing AI agents with hierarchical memory
EP159: Brute force scale dominates the AI frontier
EP158: The hidden blind spots of AI logic
EP157: [AgentHeLLM] Protecting drivers from hijacked vehicle AI
EP156: [Uncertainty Quantification] How AI Agents Know They Are Guessing
EP155: [Agentic Proposing] Small models beat giants with logic bricks
EP154: [FS-Researcher] Giving AI agents a file system
EP153: [SERA] Training AI coding agents on untested code
EP152: DeepVerifier forces AI to check its work
EP151: [MagicGUI-RMS] AI agents that think before they click
EP150: The Leap to Autonomous Agentic Reasoning
EP149: [IDRBench] Interactive AI beats lone wolf models
EP148: How AI masters math through self-correction
EP147: [DeepSynth-Eval] AI fails at deep research synthesis
EP146: How InfiAgent solves the AI memory bottleneck
EP145: [LongDA] Why smart AI fails at messy data
EP144: [Evo-Memory] Building AI agents with self-evolving memory
EP143: Your AI will blackmail you to survive
EP142: [DR-Arena] A ruthless arena for deep research agents
EP141: [AIRS-Bench] AI agents beat human research benchmarks
EP140: [LeWorldModel] AI learns physics on one GPU
EP139: Mamba-3 Fixes the Transformer Memory Bottleneck
EP138: [Mamba-2] Transformers and SSMs Are the Same Engine
EP137: Attention Residuals Solve the LLM Depth Bottleneck
EP136: Modular skills for autonomous AI agents
EP135: [SoK] Curing AI Amnesia with Agentic Skills
EP134: Autonomous AI squads building software
EP133: RelayLLM Slashes AI Costs With Collaborative Decoding
EP132: How Autonomous LLM Agents Actually Work
EP131: MUSE creates self-evolving AI agents
EP130: [GAP] Graph-based planning for faster AI agents
EP129: Why AI agents fail half the time
EP128: MCP-Zero lets AI find its own tools
EP127: Why tool use makes AI less intelligent
EP126: OrcaLoca locates bugs in massive codebases
EP125: Why AI Needs an Agent Computer Interface
EP124: FRIDAY the AI that runs your computer
EP123: MemGPT Turns LLMs into Operating Systems
EP122: The Four Pillars of LLM Autonomous Agents
EP121: How ToolLLaMA mastered 16000 real world APIs
EP120: How Reflexion agents learn through verbal feedback
EP119: HuggingGPT Turns LLMs Into AI Managers
EP118: The AI Memory Wall Crisis
EP117: AI agents learn through textual reflection
EP116: Why AI struggles with empathy and interruptions
EP115: Dr.LLM brings dynamic depth to AI
EP114: FlashAttention-4 Solves Blackwell Hardware Bottlenecks
EP113: How FlashAttention-3 Doubles H100 Speed
EP112: GPT 5.4 Outperforms Human Professionals
EP111: Claude Opus 4.6 Runs Businesses and Catches Manipulation
EP110: Single agents beat expensive multi-agent teams
EP109: The Rise of Agentic Reasoning
EP108: GPT-5 Can Lie and Play Dumb
EP107: DeepMind’s SIMA 2 Masters Unseen Video Games
EP106: Fixing AI Agents With Symbolic Guardrails
EP105: iStar Autonomous Agents Grading Their Own Homework
EP104: WebExplorer Beats Giants at Web Research
EP103: Why AI Agents Think Themselves To Death
EP102: Gemini 2.5 Thinks Before It Speaks
EP101: Kimi k1.5 Breaks the AI Data Wall
EP100: Meta's Llama 4 Herd Ends Monolithic Models
EP099: Is AI Thinking Just Expensive Noise
EP098: OpenAI o3 Hacked Its Own Grading System
EP097: DeepSeek R1 Taught Itself to Reason
EP096: Gemini 1.5 Pro's 10 Million Token Window
EP095: Microsoft Phi-4 Beats Giants With Synthetic Data
EP094: DeepSeek-V3 Rivals GPT-4 for $6 Million
EP093: How OpenAI o1 Cracked the Strawberry Cipher
EP092: BitNet b1.58 Replaces Multiplication With Addition
EP091: Qwen 2.5 Beats Llama With Synthetic Data
EP090: Pixtral 12B Beats Llama With Better Eyesight
EP089: Qwen2-VL Gives AI Native Eyesight
EP088: Qwen2 Beats Llama-3 Through Data Quality
EP087: Meta's Chameleon Unifies Text and Images
EP086: DeepSeek-V2 Breaks The Impossible Triangle
EP085: Aya 23 Breaks The Curse Of Multilinguality
EP084: Microsoft Phi-3 Fits Supercomputing in Your Pocket
EP083: How Meta Engineered the Llama 3 Herd
EP082: Command R Plus The Verifiable Enterprise Agent
EP081: Replacing MLPs With Interpretable KANs
EP080: Jamba Hybrid Solves Transformer Memory Limits
EP079: DBRX Beats GPT-3.5
EP078: Claude 3 Knew It Was Being Tested
EP077: Google Squeezes Gemini Into Your Laptop
EP076: OLMo Cracks Open the AI Black Box
EP075: Microsoft Phi Beats Giants With Synthetic Textbooks
EP074: How Gemini Beat Human Experts
EP073: Mixtral 8x7B Sparse Experts Beat Giants
EP072: Mamba Solves The Transformer's Fatal Flaw
EP071: How Zephyr-7B Beat Llama-70B
EP070: Mistral 7B Beats Llama 2 13B
EP069: Alibaba's Qwen Specialized Models Beat Generalists
EP068: vLLM Fixes the KV Cache Bottleneck
EP067: FlashAttention-2 Unlocks Massive Context Windows
EP066: Llama 2 Ghost Attention And Safety Secrets
EP065: Teaching Small AI To Think Like Giants
EP064: Synthetic Textbooks Break AI Scaling Laws
EP063: RWKV Smashes the Transformer Memory Ceiling
EP062: VOYAGER AI Masters Minecraft by Writing Code
EP061: Fine-Tuning LLaMA 65B on One GPU
EP060: Direct Preference Optimization Replaces RLHF
EP059: Tree of Thoughts Unlocks System 2 Thinking
EP058: Inside the Autonomous AI Town of Smallville
EP057: Blind GPT-4 Taught LLaVA To See
EP056: Pythia Turns AI Alchemy Into Chemistry
EP055: Can GPT-4 Fairly Judge Other AI
EP054: Alpaca - Stanford Built a $600 GPT Clone
EP053: Sparks of AGI in Early GPT-4
EP052: GPT-4 Bar Exam and Visual Reasoning
EP051: ControlNet Solves Spatial Control With Zero Convolutions
EP050: How Meta's LLaMA Beat GPT-3
EP049: Toolformer Teaches Itself to Use APIs
EP048: BLIP-2 Teaches Frozen Models to See
EP047: Bootstrapping AI With Self-Generated Instructions
EP046: Training AI With A Constitution
EP045: BLOOM The Open Source Rival To GPT-3
EP044: How ReAct Synergizes Reasoning and Acting
EP043: Weak Supervision Made OpenAI Whisper Robust
EP042: Running 175B Models on Consumer Hardware
EP041: FlashAttention Smashes the AI Memory Wall
EP040: Meta's Open Source GPT-3 Replica
EP039: Flamingo Unlocks Few-Shot Visual Reasoning
EP038: PaLM's 540 Billion Parameters Unlock Reasoning
EP037: DeepMind Chinchilla Ends The Parameter Wars
EP036: How 40 People Taught GPT-3 Manners
EP035: How Google LaMDA Learned To Use Tools
EP034: Chain of Thought Prompting Unlocks Reasoning
EP033: Democratizing Image Generation with Latent Diffusion
EP032: WebGPT Fights Hallucinations With Web Search
EP031: DeepMind RETRO Swaps Memorization For Retrieval
EP030: DeepMind's Gopher Exposes Limits of Scale
EP029: Instruction Tuning Unlocked Zero-Shot Learning
EP028: Train Short for Infinite Context
EP027: From Creative Writer to Logic Engine
EP026: LoRA Fine-Tunes Massive Models Without Supercomputers
EP025: RoPE Solves Sequence by Rotating Vectors
EP024: OpenAI CLIP Bridges Language and Vision
EP023: Scaling Switch Transformers to Trillion Parameters
EP022: DALL-E Treats Images Like Language
EP021: Vision Transformers Beat CNNs at Scale
EP020: Big Bird Scales Transformers With Sparse Attention
EP019: Facebook's Linformer Solves the Attention Bottleneck
EP018: Turning Digital Static Into Images With Diffusion
EP017: RAG Gives AI a Library Card
EP016: GPT-3 Learns From Examples Without Retraining
EP015: Longformer Smashes the 512 Token Barrier
EP014: ELECTRA Beats GPT On One GPU
EP013: Reformer Cracked the Transformer Memory Wall
EP012: Google T5 Turns Every Task Into Text
EP011: ZeRO Solved the Trillion Parameter Memory Wall
EP010: ALBERT Outperforms BERT With Parameter Sharing
EP009: Slicing the AI Brain with Megatron-LM
EP008: RoBERTa Proves BERT Was Just Undertrained
EP007: How GPT-2 Hallucinated Ovid's Unicorn
EP006: Transformer-XL Cures AI Amnesia
EP005: How BERT Mastered Language by Hiding Words
EP004: How 7000 Unpublished Books Birthed GPT
EP003: How ELMo Made Word Vectors Dynamic
EP002: ULMFiT Was the ImageNet Moment for Text
EP001: How Transformers Smashed the Sequential Bottleneck