All Episodes
AI on Air — 79 episodes
Shadow AI
Meta AI's V-JEPA 2: World Models for Understanding and Planning
NovelSeek Autonomous Scientific Research Framework
Qwen2.5-Math RLVR: Learning from Errors
AlphaEvolve: A Gemini-Powered Coding Agent
OpenAI Codex: Parallel Coding in ChatGPT
Agentic AI Design Patterns
Machine Learning for High-Risk Pregnancy Prediction
AI Mobile Edge Offloading for QoE and Energy Efficiency
Blockchain Chatbot CVD Screening
Deep Learning for Mammographic Breast Density Prediction
RLHF for Large Language Model Fine-Tuning
UB-Mesh: Advancing LLM Training Infrastructure
FASTCURL: Reinforcement Learning for Enhanced AI Reasoning
National AI for Cardiovascular Care: Nature Medicine Analysis
Vision-Language Reward Models: Advancements and Benchmarking
Advancing Vision-Language Reward Models
Mix-LN: A Hybrid Normalization Technique
Direct Q-Function Optimization for LLMs
RAG Attacks on LLMs
SmolAgents: AI Agents in Few Lines of Code
ByteDance's 1.58-bit FLUX AI Model
HuatuoGPT-o1: Advanced Medical Reasoning
FDA Authorizes AI Sepsis Detection Tool
Safe and Efficient Agentic AI
Apple's AI Strategy
Mix-LN: Hybrid Normalization for Transformers
LOTUS 1.0.0: Open-Source Query Engine
OpenAI o3: A Measured Advancement in AI Reasoning
LLM Alignment Faking: A New Threat
TOMG-Bench: A New AI Benchmark for Molecule Generation
Multi-Agent AI Frameworks
Alibaba vs. OpenAI: The AI Race Heats Up
Gemini 2.0: AI Research Assistant Capabilities
Maya: An Open-Source Multilingual AI Model
EXAONE 3.5: Enhanced Bilingual AI
Hugging Face TGI v3.0: Faster Text Generation
Density: A New Metric for Evaluating LLMs
Snowflake's Arctic Embed 2.0
ALAMA: Adaptive Language Model with Auxiliary Memory
Alibaba's AI Challenge to OpenAI
Building Effective AI Agents
Evaluating and Improving LLMs: Four Novel Approaches
AI Scientists: Revolutionizing Scientific Research
TamGen: AI for Antibiotic Discovery
SEALONG: Extending LLM Context Windows
AI Unveils Hidden Climate Extremes
Microsoft GraphRAG: Revolutionizing Data Analysis
Meet OpenCoder: A Completely Open-Source Code LLM Built on the Transparent Data Process Pipeline and Reproducible Dataset
New Scaling Laws for Optimizing Model and Dataset Proportions in Behavior Cloning and World Modeling Tasks
NVIDIA Launches LLaMA-Mesh, a Unified 3D Mesh Generation Method Using LLMs
BLIP3-KALE: An Open-Source Dataset of 218 Million Image-Text Pairs Transforming Image Captioning with Knowledge-Augmented Dense Descriptions
A Robust AI Solution for Managing Memory Constraints and Improving Classification Accuracy in Transformer-Based NLP Models
This AI Paper by Inria Introduces the Tree of Problems: A Simple Yet Effective Framework for Complex Reasoning in Language Models
Is Your LLM Agent Enterprise-Ready? Salesforce AI Research Introduces CRMArena
Databricks Mosaic Research Examines Long-Context Retrieval-Augmented Generation
RT-Affordance: A Hierarchical Method that Uses Affordances as an Intermediate Representation for Policies
Researchers at Peking University Introduce a New AI Benchmark for Evaluating Numerical Understanding and Processing in LLMs
FrontierMath: The Benchmark that Highlights AI’s Limits in Mathematics
Databricks Mosaic Research Examines Long-Context Retrieval-Augmented Generation: How Leading AI Models Handle Expansive Information for Improved Response Accuracy
UniMTS: A Unified Pre-Training Procedure for Motion Time Series that Generalizes Across Diverse Device Latent Factors and Activities
Meet Hawkish 8B: A New Financial Domain Model that can Pass CFA Level 1 and Outperform Meta Llama-3.1-8B-Instruct in Math & Finance Benchmarks
MiniCTX: Advancing Context-Dependent Theorem Proving in Large Language Models
How TrigFlow’s Innovative Framework Narrowed the Gap with Leading Diffusion Models Using Just Two Sampling Steps
MathGAP: An Evaluation Benchmark for LLMs’ Mathematical Reasoning Using Controlled Proof Depth, Width, and Complexity for Out-of-Distribution Tasks
Can LLMs Follow Instructions Reliably? A Look at Uncertainty Estimation Challenges
Zhipu AI Releases GLM-4-Voice: A New Open-Source End-to-End Speech Large Language Model
Meta AI Researchers Introduce Token-Level Detective Reward Model (TLDR) to Provide Fine-Grained Annotations for Large Vision Language Models
Google Researchers Introduce UNBOUNDED: An Interactive Generative Infinite Game based on Generative AI Models
This AI Paper Explores If Human Visual Perception can Help Computer Vision Models Outperform in Generalized Tasks
Microsoft To Launch 'AI Agents' to Help You Handle Routine Tasks
IBM unveils new open source AI ‘Granite 3.0’ models for business
Refined Local Learning Coefficients (rLLCs): A Novel Machine Learning Approach to Understanding the Development of Attention Heads in Transformers
This AI Research from Cohere for AI Compares Merging vs Data Mixing as a Recipe for Building High-Performant Aligned LLMs
Are Brains and AI Converging?—an excerpt from ‘ChatGPT and the Future of AI: The Deep Language Revolution’
CREAM: A New Self-Rewarding Method that Allows the Model to Learn More Selectively and Emphasize Reliable Preference Data
Embed-then-Regress: A Versatile Machine Learning Approach for Bayesian Optimization Using String-Based In-Context Regression
Graph-Constrained Reasoning (GCR): A Novel AI Framework that Bridges Structured Knowledge in Knowledge Graphs with Unstructured Reasoning in LLMs
MMed-RAG: A Versatile Multimodal Retrieval-Augmented Generation System Transforming Factual Accuracy in Medical Vision-Language Models Across Multiple Domains