Ep. 257 - June 7, 2024
An episode of the TechcraftingAI NLP podcast, hosted by Brad Edwards, titled "Ep. 257 - June 7, 2024" was published on June 10, 2024, and runs 52 minutes.
June 10, 2024 · 52m · TechcraftingAI NLP
Episode Description
ArXiv NLP research for Friday, June 07, 2024.
00:19: Key-Element-Informed sLLM Tuning for Document Summarization
01:22: Low-Resource Cross-Lingual Summarization through Few-Shot Learning with Large Language Models
02:42: Large Language Model-guided Document Selection
04:13: More Victories, Less Cooperation: Assessing Cicero's Diplomacy Play
05:24: DiNeR: a Large Realistic Dataset for Evaluating Compositional Generalization
06:43: MATTER: Memory-Augmented Transformer Using Heterogeneous Knowledge Sources
08:01: Mixture-of-Agents Enhances Large Language Model Capabilities
09:09: AICoderEval: Improving AI Domain Code Generation of Large Language Models
11:00: CRAG -- Comprehensive RAG Benchmark
13:04: CRiskEval: A Chinese Multi-Level Risk Evaluation Benchmark Dataset for Large Language Models
14:52: Think out Loud: Emotion Deducing Explanation in Dialogues
16:43: WildBench: Benchmarking LLMs with Challenging Tasks from Real Users in the Wild
18:46: SelfGoal: Your Language Agents Already Know How to Achieve High-level Goals
19:58: BERTs are Generative In-Context Learners
20:43: Annotating FrameNet via Structure-Conditioned Language Generation
21:49: Revisiting Catastrophic Forgetting in Large Language Model Tuning
22:43: FedLLM-Bench: Realistic Benchmarks for Federated Learning of Large Language Models
24:33: Do Language Models Exhibit Human-like Structural Priming Effects?
25:27: Uncertainty Aware Learning for Language Model Alignment
26:50: The Russian Legislative Corpus
27:24: ComplexTempQA: A Large-Scale Dataset for Complex Temporal Question Answering
28:53: HateDebias: On the Diversity and Variability of Hate Speech Debiasing
30:29: A Deep Dive into the Trade-Offs of Parameter-Efficient Preference Alignment Techniques
32:00: Sexism Detection on a Data Diet
33:18: XTTS: a Massively Multilingual Zero-Shot Text-to-Speech Model
34:21: Through the Thicket: A Study of Number-Oriented LLMs derived from Random Forest Models
35:32: LLM-based speaker diarization correction: A generalizable approach
36:52: TCMD: A Traditional Chinese Medicine QA Dataset for Evaluating Large Language Models
38:10: BAMO at SemEval-2024 Task 9: BRAINTEASER: A Novel Task Defying Common Sense
39:10: Quantifying Geospatial in the Common Crawl Corpus
40:14: MEFT: Memory-Efficient Fine-Tuning through Sparse Adapter
41:47: Language models emulate certain cognitive profiles: An investigation of how predictability measures interact with individual differences
43:19: Compositional Generalization with Grounded Language Models
44:26: Scenarios and Approaches for Situated Natural Language Explanations
46:04: Are Large Language Models More Empathetic than Humans?
47:38: SUMIE: A Synthetic Benchmark for Incremental Entity Summarization
48:52: Multi-Head RAG: Solving Multi-Aspect Problems with LLMs
50:33: An Empirical Study on Parameter-Efficient Fine-Tuning for MultiModal Large Language Models
Similar Episodes
Jun 15, 2024 · 22m
Jun 13, 2024 · 19m
Jun 13, 2024 · 16m
Jun 11, 2024 · 19m
Jun 11, 2024 · 14m
Jun 11, 2024 · 11m