Episode 227 - LLM Evaluation: Choosing the RIGHT Model

EPISODE · Feb 14, 2025 · 38 MIN

from Two Voice Devs · host Mark and Allen

Are you overwhelmed by the sheer number of Large Language Models (LLMs) available? Choosing the right LLM for your project isn't about picking the most popular one – it's about understanding your specific needs and rigorously evaluating your options.

In this episode of Two Voice Devs, Allen Firstenberg and guest host Brad Nemer, a seasoned product manager, dive deep into the world of LLM evaluation. They go beyond the marketing buzz and explore practical tools and strategies for making informed decisions.

Whether you're a developer, a product manager, or just curious about the practical applications of LLMs, this episode provides invaluable insights into making the right choices for your projects. Don't get caught up in the hype – learn how to evaluate LLMs effectively!

More Info:
https://www.udacity.com/blog/2025/01/how-to-choose-the-right-ai-model-for-your-product.html

[00:00:00] Introduction: Meet Brad Nemer
[00:00:38] Brad's Journey to Product Management & AI
[00:03:12] Collaboration with Noble Ackerson and the LLM Evaluation Challenge
[00:05:23] The Role of a Product Manager
[00:07:43] The Product Manager's Relationship to Engineering
[00:13:46] Exploring Evaluation Tools: Hugging Face
[00:16:58] Exploring Evaluation Tools: Chatbot Arena (Human Evaluation)
[00:20:30] Chatbot Arena: Code Generation Evaluation
[00:24:43] Evaluating LLMs: Beyond Chatbots and Truth
[00:26:11] Exploring Evaluation Tools: Artificial Analysis (Quality, Speed, Price)
[00:28:47] Exploring Evaluation Tools: Galileo (Hallucination Report)
[00:31:16] Case Study: DeepSeek and the Importance of Contextual Evaluation
[00:34:53] The Future of LLM Testing and Quality Assurance
[00:37:49] Wrap-Up and Contact Information

#LLM #LargeLanguageModels #AIEvaluation #ProductManagement #TechTalk #TwoVoiceDevs #HuggingFace #GenAI #GenerativeAI #ChatbotArena #ArtificialAnalysis #Galileo #DeepSeek #ChatGPT #Gemini #Mistral #Claude #ModelSelection #AIdevelopment #SoftwareDevelopment #Testing #QA #RAG #MachineLearning #NLP #Coding #TechPodcast #YouTubeTech #Developers
