EPISODE · Jun 10, 2024 · 52 MIN
Ep. 257 - June 7, 2024
from TechcraftingAI NLP · host Brad Edwards
ArXiv NLP research for Friday, June 07, 2024.

00:19: Key-Element-Informed sLLM Tuning for Document Summarization
01:22: Low-Resource Cross-Lingual Summarization through Few-Shot Learning with Large Language Models
02:42: Large Language Model-guided Document Selection
04:13: More Victories, Less Cooperation: Assessing Cicero's Diplomacy Play
05:24: DiNeR: a Large Realistic Dataset for Evaluating Compositional Generalization
06:43: MATTER: Memory-Augmented Transformer Using Heterogeneous Knowledge Sources
08:01: Mixture-of-Agents Enhances Large Language Model Capabilities
09:09: AICoderEval: Improving AI Domain Code Generation of Large Language Models
11:00: CRAG -- Comprehensive RAG Benchmark
13:04: CRiskEval: A Chinese Multi-Level Risk Evaluation Benchmark Dataset for Large Language Models
14:52: Think out Loud: Emotion Deducing Explanation in Dialogues
16:43: WildBench: Benchmarking LLMs with Challenging Tasks from Real Users in the Wild
18:46: SelfGoal: Your Language Agents Already Know How to Achieve High-level Goals
19:58: BERTs are Generative In-Context Learners
20:43: Annotating FrameNet via Structure-Conditioned Language Generation
21:49: Revisiting Catastrophic Forgetting in Large Language Model Tuning
22:43: FedLLM-Bench: Realistic Benchmarks for Federated Learning of Large Language Models
24:33: Do Language Models Exhibit Human-like Structural Priming Effects?
25:27: Uncertainty Aware Learning for Language Model Alignment
26:50: The Russian Legislative Corpus
27:24: ComplexTempQA: A Large-Scale Dataset for Complex Temporal Question Answering
28:53: HateDebias: On the Diversity and Variability of Hate Speech Debiasing
30:29: A Deep Dive into the Trade-Offs of Parameter-Efficient Preference Alignment Techniques
32:00: Sexism Detection on a Data Diet
33:18: XTTS: a Massively Multilingual Zero-Shot Text-to-Speech Model
34:21: Through the Thicket: A Study of Number-Oriented LLMs derived from Random Forest Models
35:32: LLM-based speaker diarization correction: A generalizable approach
36:52: TCMD: A Traditional Chinese Medicine QA Dataset for Evaluating Large Language Models
38:10: BAMO at SemEval-2024 Task 9: BRAINTEASER: A Novel Task Defying Common Sense
39:10: Quantifying Geospatial in the Common Crawl Corpus
40:14: MEFT: Memory-Efficient Fine-Tuning through Sparse Adapter
41:47: Language models emulate certain cognitive profiles: An investigation of how predictability measures interact with individual differences
43:19: Compositional Generalization with Grounded Language Models
44:26: Scenarios and Approaches for Situated Natural Language Explanations
46:04: Are Large Language Models More Empathetic than Humans?
47:38: SUMIE: A Synthetic Benchmark for Incremental Entity Summarization
48:52: Multi-Head RAG: Solving Multi-Aspect Problems with LLMs
50:33: An Empirical Study on Parameter-Efficient Fine-Tuning for MultiModal Large Language Models