EPISODE · Dec 13, 2024 · 11 MIN
Abstracts: NeurIPS 2024 with Jindong Wang and Steven Euijong Whang
from Microsoft Research Podcast · host Researchers across the Microsoft research community
Researcher Jindong Wang and Associate Professor Steven Euijong Whang explore the NeurIPS 2024 work ERBench. ERBench leverages relational databases to create LLM benchmarks that can verify model rationale via keywords in addition to checking answer correctness.