EPISODE · Mar 1, 2026 · 14 MIN

Off-the-Shelf Large Language Models Are Unreliable Judges – Jonathan Choi (USC / WashU)

from Talking law and economics at ETH Zurich · host ETH Center for Law & Economics

With the rapid rise of artificial intelligence, large language models (LLMs) are increasingly being considered for tasks once thought to be uniquely human, including legal interpretation. The idea of “AI judges” suggests appealing possibilities: consistent, fast, and ostensibly unbiased answers to legal questions. But how reliable are these models? Can their judgments truly be trusted? And do they withstand careful empirical scrutiny?

In this episode of the CLE Vlog Series, Prof. Jonathan Choi (University of Southern California & Washington University, St. Louis) joins Alessandro Tacconelli (ETH Zurich) to discuss his paper, “Off-the-Shelf Large Language Models Are Unreliable Judges.” Prof. Choi presents findings from a series of empirical experiments designed to test how well LLMs perform as legal interpreters. His results reveal that model judgments are highly sensitive to prompt phrasing, output processing methods, and training choices. Moreover, post-training adjustments in today’s most widely used models can push LLMs’ assessments far from empirically grounded predictions of language use. These insights raise serious questions about the credibility of LLMs in legal interpretation and cast doubt on their ability to capture the “ordinary meaning” of legal texts.

Paper Reference:
Jonathan Choi – University of Southern California / Washington University (St. Louis)
Large Language Models Are Unreliable Judges
https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5188865

Audio Credits for Trailer:
AllttA by AllttA
https://youtu.be/ZawLOcbQZ2w
