Improve your prompts by hill-climbing with Evaluations

from Podkey WWDC 2026

A Podkey summary of Improve your prompts by hill-climbing with Evaluations, from WWDC 2026.

Today’s thread is really about one deceptively simple question: how do you know your AI evaluator is actually judging things the way a human would? The big themes are drift between model and expert ratings, a more honest way to measure agreement with Cohen’s kappa, and a very practical process of improving prompts one change at a time. There’s also a useful reminder that better scores in one area can quietly make another area worse, and that tiny datasets can flatter you more than they should.

When the judge starts drifting
Why raw agreement isn’t enough
Prompt tuning like an experiment
Control versus experimental
Relevance and usefulness are not the same thing
One change at a time
Giving the model a lookup tool
The dataset is probably too small
Better aggregation, better decisions
Using generated samples to test tools too

This podcast was created with Podkey. Make your own at https://podkey.fm
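To make the point about raw agreement versus Cohen's kappa concrete, here is a minimal Python sketch. It is not taken from the session: the function names and the sample labels are hypothetical, and it assumes each item has exactly one categorical label from a human expert and one from the model judge. Kappa is computed with the standard formula (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance from each rater's label frequencies.

```python
from collections import Counter


def raw_agreement(human, model):
    """Fraction of items on which the two raters gave the same label."""
    if len(human) != len(model) or not human:
        raise ValueError("ratings must be non-empty and of equal length")
    return sum(h == m for h, m in zip(human, model)) / len(human)


def cohens_kappa(human, model):
    """Cohen's kappa: agreement corrected for chance agreement."""
    n = len(human)
    p_o = raw_agreement(human, model)
    human_counts, model_counts = Counter(human), Counter(model)
    labels = set(human_counts) | set(model_counts)
    # Chance agreement: probability both raters pick the same label
    # if each labels independently at their own observed rates.
    p_e = sum(human_counts[l] * model_counts[l] for l in labels) / (n * n)
    if p_e == 1.0:
        # Both raters used one identical label everywhere; kappa is undefined.
        return float("nan")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical ratings: the expert fails 2 of 10 items, the judge passes all.
human = ["pass"] * 8 + ["fail"] * 2
model = ["pass"] * 10

print(raw_agreement(human, model))  # 0.8
print(cohens_kappa(human, model))   # 0.0
```

In this made-up case the judge matches the expert on 80% of items, yet kappa is 0 because a judge that always says "pass" would reach that figure by chance alone. With only ten items the estimate is also very coarse, which fits the summary's caution about tiny datasets.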

Frequently Asked Questions

How long is this episode of Podkey WWDC 2026?

This episode is 7 minutes long.

When was this Podkey WWDC 2026 episode published?

This episode was published on June 12, 2026.
