Prof. Subbarao Kambhampati - LLMs don't reason, they memorize (ICML2024 2/13) episode artwork

EPISODE · Jul 29, 2024 · 1H 42M

Prof. Subbarao Kambhampati - LLMs don't reason, they memorize (ICML2024 2/13)

from Machine Learning Street Talk (MLST)

Prof. Subbarao Kambhampati argues that while LLMs are impressive and useful tools, especially for creative tasks, they have fundamental limitations in logical reasoning and cannot provide guarantees about the correctness of their outputs. He advocates for hybrid approaches that combine LLMs with external verification systems. MLST is sponsored by Brave: The Brave Search API covers over 20 billion webpages, built from scratch without Big Tech biases or the recent extortionate price hikes on search API access. Perfect for AI model training and retrieval augmentated generation. Try it now - get 2,000 free queries monthly at http://brave.com/api. TOC (sorry the ones baked into the MP3 were wrong apropos due to LLM hallucination!) [00:00:00] Intro [00:02:06] Bio [00:03:02] LLMs are n-gram models on steroids [00:07:26] Is natural language a formal language? [00:08:34] Natural language is formal? [00:11:01] Do LLMs reason? [00:19:13] Definition of reasoning [00:31:40] Creativity in reasoning [00:50:27] Chollet's ARC challenge [01:01:31] Can we reason without verification? [01:10:00] LLMs cant solve some tasks [01:19:07] LLM Modulo framework [01:29:26] Future trends of architecture [01:34:48] Future research directions Youtube version: https://www.youtube.com/watch?v=y1WnHpedi2A Refs: (we didn't have space for URLs here, check YT video description instead) Can LLMs Really Reason and Plan? On the Planning Abilities of Large Language Models : A Critical Investigation Chain of Thoughtlessness? An Analysis of CoT in Planning On the Self-Verification Limitations of Large Language Models on Reasoning and Planning Tasks LLMs Can't Plan, But Can Help Planning in LLM-Modulo Frameworks Embers of Autoregression: Understanding Large Language Models Through the Problem They are Trained to Solve "Task Success" is not Enough Partition function (number theory) (Srinivasa Ramanujan and G.H. Hardy's work) Poincaré conjecture Gödel's incompleteness theorems ROT13 (Rotate13, "rotate by 13 places") A Mathematical Theory of Communication (C. E. SHANNON) Sparks of AGI Kambhampati thesis on speech recognition (1983) PlanBench: An Extensible Benchmark for Evaluating Large Language Models on Planning and Reasoning about Change Explainable human-AI interaction Tree of Thoughts On the Measure of Intelligence (ARC Challenge) Getting 50% (SoTA) on ARC-AGI with GPT-4o (Ryan Greenblatt ARC solution) PROGRAMS WITH COMMON SENSE (John McCarthy) - "AI should be an advice taker program" Original chain of thought paper ICAPS 2024 Keynote: Dale Schuurmans on "Computing and Planning with Large Generative Models" (COT) The Hardware Lottery (Hooker) A Path Towards Autonomous Machine Intelligence (JEPA/LeCun) AlphaGeometry FunSearch Emergent Abilities of Large Language Models Language models are not naysayers (Negation in LLMs) The Reversal Curse: LLMs trained on "A is B" fail to learn "B is A" Embracing negative results

Prof. Subbarao Kambhampati argues that while LLMs are impressive and useful tools, especially for creative tasks, they have fundamental limitations in logical reasoning and cannot provide guarantees about the correctness of their outputs. He advocates for hybrid approaches that combine LLMs with external verification systems. MLST is sponsored by Brave: The Brave Search API covers over 20 billion webpages, built from scratch without Big Tech biases or the recent extortionate price hikes on search API access. Perfect for AI model training and retrieval augmentated generation. Try it now - get 2,000 free queries monthly at http://brave.com/api. TOC (sorry the ones baked into the MP3 were wrong apropos due to LLM hallucination!) [00:00:00] Intro [00:02:06] Bio [00:03:02] LLMs are n-gram models on steroids [00:07:26] Is natural language a formal language? [00:08:34] Natural language is formal? [00:11:01] Do LLMs reason? [00:19:13] Definition of reasoning [00:31:40] Creativity in reasoning [00:50:27] Chollet's ARC challenge [01:01:31] Can we reason without verification? [01:10:00] LLMs cant solve some tasks [01:19:07] LLM Modulo framework [01:29:26] Future trends of architecture [01:34:48] Future research directions Youtube version: https://www.youtube.com/watch?v=y1WnHpedi2A Refs: (we didn't have space for URLs here, check YT video description instead) Can LLMs Really Reason and Plan? On the Planning Abilities of Large Language Models : A Critical Investigation Chain of Thoughtlessness? An Analysis of CoT in Planning On the Self-Verification Limitations of Large Language Models on Reasoning and Planning Tasks LLMs Can't Plan, But Can Help Planning in LLM-Modulo Frameworks Embers of Autoregression: Understanding Large Language Models Through the Problem They are Trained to Solve "Task Success" is not Enough Partition function (number theory) (Srinivasa Ramanujan and G.H. Hardy's work) Poincaré conjecture Gödel's incompleteness theorems ROT13 (Rotate13, "rotate by 13 places") A Mathematical Theory of Communication (C. E. SHANNON) Sparks of AGI Kambhampati thesis on speech recognition (1983) PlanBench: An Extensible Benchmark for Evaluating Large Language Models on Planning and Reasoning about Change Explainable human-AI interaction Tree of Thoughts On the Measure of Intelligence (ARC Challenge) Getting 50% (SoTA) on ARC-AGI with GPT-4o (Ryan Greenblatt ARC solution) PROGRAMS WITH COMMON SENSE (John McCarthy) - "AI should be an advice taker program" Original chain of thought paper ICAPS 2024 Keynote: Dale Schuurmans on "Computing and Planning with Large Generative Models" (COT) The Hardware Lottery (Hooker) A Path Towards Autonomous Machine Intelligence (JEPA/LeCun) AlphaGeometry FunSearch Emergent Abilities of Large Language Models Language models are not naysayers (Negation in LLMs) The Reversal Curse: LLMs trained on "A is B" fail to learn "B is A" Embracing negative results

NOW PLAYING

Prof. Subbarao Kambhampati - LLMs don't reason, they memorize (ICML2024 2/13)

0:00 1:42:27

No transcript for this episode yet

We transcribe on demand. Request one and we'll notify you when it's ready — usually under 10 minutes.

French Your Way Jessica: Native French teacher founder of French Your Way Boost your French listening skills and test your comprehension with this one of a kind series of podcasts. Get the chance to listen to a real conversation between native speakers talking at normal speed AND customise your learning experience through carefully designed sets of questions (2 levels of difficulty) available for download at www.frenchvoicespodcast.com. All interviews also come with the transcript. French teacher Jessica interviews native speakers of French from around the world who share a bit of their life and passion. Where else would you meet in one same place a French yoga teacher based in Melbourne, a soap manufacturer from Provence, or a couple cycling around the world? Kaizen Blueprint Aldo Chandra "Kaizen" is a Japanese term for continuous improvement. This podcast provides a blueprint to learn about health, wealth, relationships and everything else in between. Through our podcast, we strive to inspire, educate, and motivate our audience to cultivate a mindset of lifelong learning, productivity, and personal development. By sharing insights, strategies, and practical tips, we aim to guide listeners on their journey towards realizing their fullest potential, fostering success, and creating lasting positive change. One Man Went To Row PepperDawesMedia Follow the journey, from training to finish line, of a man from Derby, UK who is going from having only ever rowed on a machine to rowing 3000 miles solo across the Atlantic...just after his 70th birthday! Humanizing Change Tremendousness Join us each episode as we talk with innovators in their respective fields about their unique journeys and how they humanize change in their own work, right here, on Humanizing Change.

Frequently Asked Questions

How long is this episode of Machine Learning Street Talk (MLST)?

This episode is 1 hour and 42 minutes long.

When was this Machine Learning Street Talk (MLST) episode published?

This episode was published on July 29, 2024.

What is this episode about?

Prof. Subbarao Kambhampati argues that while LLMs are impressive and useful tools, especially for creative tasks, they have fundamental limitations in logical reasoning and cannot provide guarantees about the correctness of their outputs. He...

Can I download this Machine Learning Street Talk (MLST) episode?

Yes, you can download this episode by clicking the download button on the episode player, or subscribe to the podcast in your preferred podcast app for automatic downloads.
URL copied to clipboard!