EPISODE · Sep 29, 2025 · 10 MIN
Evaluating Retrieval Capabilities of Language Models [Microsoft]
from Snacks Weekly on Data Science · host Pan Wu
In this episode, we explore how to evaluate the retrieval-augmented generation (RAG) capabilities of small language models. On the business side, we discuss why RAG, long context windows, and small language models are critical for building scalable and reliable AI systems. On the technical side, we walk through the Needle-in-a-Haystack methodology and discuss key findings about retrieval performance across different models. For more details, see the published tech blog post from Data Science at Microsoft: https://medium.com/data-science-at-microsoft/evaluating-rag-capabilities-of-small-language-models-e7531b3a5061
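For listeners who want a concrete picture of the Needle-in-a-Haystack idea before listening, here is a minimal sketch of the general technique, not the exact setup from the Microsoft blog post. A known "needle" sentence is placed at a chosen relative depth inside filler text of a chosen length, the model is asked a question only the needle answers, and the reply is scored by a simple substring match. The names `build_haystack`, `run_grid`, and `ask_model` are hypothetical, and `ask_model` stands in for whatever model call you would actually use.

```python
from typing import Callable, Dict, List, Tuple


def build_haystack(filler: List[str], needle: str, depth: float) -> str:
    """Insert the needle at a relative depth (0.0 = start, 1.0 = end)."""
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0.0 and 1.0")
    position = int(depth * len(filler))
    return " ".join(filler[:position] + [needle] + filler[position:])


def run_grid(
    ask_model: Callable[[str, str], str],  # hypothetical: (context, question) -> answer
    filler: List[str],
    needle: str,
    question: str,
    expected: str,
    lengths: List[int],    # context sizes, counted in filler sentences
    depths: List[float],   # relative needle positions to test
) -> Dict[Tuple[int, float], bool]:
    """Return, for each (length, depth) pair, whether the answer contained `expected`."""
    results: Dict[Tuple[int, float], bool] = {}
    for length in lengths:
        for depth in depths:
            context = build_haystack(filler[:length], needle, depth)
            answer = ask_model(context, question)
            # Assumption: case-insensitive substring match is a good enough score.
            results[(length, depth)] = expected.lower() in answer.lower()
    return results


if __name__ == "__main__":
    filler = [f"Filler sentence number {i}." for i in range(100)]
    needle = "The secret passphrase is 'blue heron'."

    # Stand-in model that simply echoes the context; replace with a real model call.
    def echo_model(context: str, question: str) -> str:
        return context

    grid = run_grid(
        echo_model, filler, needle,
        question="What is the secret passphrase?",
        expected="blue heron",
        lengths=[10, 50, 100],
        depths=[0.0, 0.5, 1.0],
    )
    for (length, depth), found in sorted(grid.items()):
        print(f"length={length:3d} depth={depth:.1f} retrieved={found}")
```

With a real model in place of `echo_model`, the resulting grid of hits and misses across context length and needle depth is what is usually plotted as a heatmap to compare retrieval performance between models.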