Jay Alammar on LLMs, RAG, and AI Engineering

"Jay Alammar on LLMs, RAG, and AI Engineering" is an episode of the Machine Learning Street Talk (MLST) podcast. It was published on August 11, 2024 and runs 57 minutes.

August 11, 2024 · 57m · Machine Learning Street Talk (MLST)

Summary

Jay Alammar, renowned AI educator and researcher at Cohere, discusses the latest developments in large language models (LLMs) and their applications in industry. Jay shares his expertise on retrieval augmented generation (RAG), semantic search, and the future of AI architectures.

Episode Description

MLST is sponsored by Brave:

The Brave Search API covers over 20 billion webpages, built from scratch without Big Tech biases or the recent extortionate price hikes on search API access. Perfect for AI model training and retrieval augmented generation. Try it now - get 2,000 free queries monthly at http://brave.com/api.

Cohere Command R model series: https://cohere.com/command

Jay Alammar:

https://x.com/jayalammar

Buy Jay's new book here!

Hands-On Large Language Models: Language Understanding and Generation

https://amzn.to/4fzOUgh

TOC:

00:00:00 Introduction to Jay Alammar and AI Education

00:01:47 Cohere's Approach to RAG and AI Re-ranking

00:07:15 Implementing AI in Enterprise: Challenges and Solutions

00:09:26 Jay's Role at Cohere and the Importance of Learning in Public

00:15:16 The Evolution of AI in Industry: From Deep Learning to LLMs

00:26:12 Expert Advice for Newcomers in Machine Learning

00:32:39 The Power of Semantic Search and Embeddings in AI Systems

00:37:59 Jay Alammar's Journey as an AI Educator and Visualizer

00:43:36 Visual Learning in AI: Making Complex Concepts Accessible

00:47:38 Strategies for Keeping Up with Rapid AI Advancements

00:49:12 The Future of Transformer Models and AI Architectures

00:51:40 Evolution of the Transformer: From 2017 to Present

00:54:19 Preview of Jay's Upcoming Book on Large Language Models

Disclaimer: This is the fourth video from our Cohere partnership. We were not told what to say in the interview, and we didn't edit anything out of it. Note also that this combines several previously unpublished interviews with Jay into one: the earlier one, at Tim's house, was shot in Aug 2023, and the more recent one, in Toronto, in May 2024.

Refs:

The Illustrated Transformer

https://jalammar.github.io/illustrated-transformer/

Attention Is All You Need

https://arxiv.org/abs/1706.03762

The Unreasonable Effectiveness of Recurrent Neural Networks

http://karpathy.github.io/2015/05/21/rnn-effectiveness/

Neural Networks in 11 Lines of Code

https://iamtrask.github.io/2015/07/12/basic-python-network/

Understanding LSTM Networks (Chris Olah's blog post)

http://colah.github.io/posts/2015-08-Understanding-LSTMs/

Luis Serrano's YouTube Channel

https://www.youtube.com/channel/UCgBncpylJ1kiVaPyP-PZauQ

Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks

https://arxiv.org/abs/1908.10084

GPT (Generative Pre-trained Transformer) models

https://jalammar.github.io/illustrated-gpt2/

https://openai.com/research/gpt-4

BERT (Bidirectional Encoder Representations from Transformers)

https://jalammar.github.io/illustrated-bert/

https://arxiv.org/abs/1810.04805

RoPE (Rotary Position Embedding)

https://arxiv.org/abs/2104.09864 (Linked paper discussing rotary embeddings)

Grouped Query Attention

https://arxiv.org/pdf/2305.13245

RLHF (Reinforcement Learning from Human Feedback)

https://openai.com/research/learning-from-human-preferences

https://arxiv.org/abs/1706.03741

DPO (Direct Preference Optimization)

https://arxiv.org/abs/2305.18290
