706: Large Language Model Leaderboards and Benchmarks

from Super Data Science: ML & AI Podcast with Jon Krohn · host Jon Krohn

In this episode, Caterina Constantinescu dives deep into Large Language Models (LLMs), spotlighting top leaderboards, evaluation benchmarks, and real-world user perceptions. Plus, discover the challenges of dataset contamination and the intricacies of platforms like HELM and Chatbot Arena.

Additional materials: www.superdatascience.com/706

Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.

Episode metadata supplied by the publisher feed · Published Aug 18, 2023

Frequently Asked Questions

How long is this episode of Super Data Science: ML & AI Podcast with Jon Krohn?

This episode is 33 minutes long.

When was this Super Data Science: ML & AI Podcast with Jon Krohn episode published?

This episode was published on August 18, 2023.

