Ep. 256 - Part 1 - June 6, 2024
An episode of the TechcraftingAI NLP podcast, hosted by Brad Edwards, titled "Ep. 256 - Part 1 - June 6, 2024", was published on June 7, 2024, and runs 35 minutes.
June 7, 2024 · 35m · TechcraftingAI NLP
Episode Description
ArXiv NLP research for Thursday, June 06, 2024.
00:20: Efficient Knowledge Infusion via KG-LLM Alignment
01:25: NAP^2: A Benchmark for Naturalness and Privacy-Preserving Text Rewriting by Learning from Human
02:34: Character-Level Chinese Dependency Parsing via Modeling Latent Intra-Word Structure
03:30: XL-HeadTags: Leveraging Multimodal Retrieval Augmentation for the Multilingual Generation of News Headlines and Tags
04:59: End-to-End Trainable Soft Retriever for Low-resource Relation Extraction
06:07: Light-PEFT: Lightening Parameter-Efficient Fine-Tuning via Early Pruning
07:37: Improving Zero-Shot Chinese-English Code-Switching ASR with kNN-CTC and Gated Monolingual Datastores
08:52: ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search
10:29: Chaos with Keywords: Exposing Large Language Models Sycophancy to Misleading Keywords and Evaluating Defense Strategies
11:39: Lean Workbook: A large-scale Lean problem set formalized from natural language math problems
12:56: Speculative Decoding via Early-exiting for Faster LLM Inference with Thompson Sampling Control Mechanism
14:18: Performance of large language models in numerical vs. semantic medical knowledge: Benchmarking on evidence-based Q&As
16:24: Recovering document annotations for sentence-level bitext
17:40: BLSP-Emo: Towards Empathetic Large Speech-Language Models
19:01: Decoder-only Streaming Transformer for Simultaneous Translation
20:28: Evaluating the IWSLT2023 Speech Translation Tasks: Human Annotations, Automatic Metrics, and Segmentation
21:53: Spontaneous Speech-Based Suicide Risk Detection Using Whisper and Large Language Models
23:06: How Good is Zero-Shot MT Evaluation for Low Resource Indian Languages?
24:13: HeSum: a Novel Dataset for Abstractive Text Summarization in Hebrew
25:19: ArMeme: Propagandistic Content in Arabic Memes
26:26: Culturally Aware and Adapted NLP: A Taxonomy and a Survey of the State of the Art
27:11: UltraMedical: Building Specialized Generalists in Biomedicine
28:43: Tox-BART: Leveraging Toxicity Attributes for Explanation Generation of Implicit Hate Speech
30:02: A + B: A General Generator-Reader Framework for Optimizing LLMs to Unleash Synergy Potential
31:29: On The Persona-based Summarization of Domain-Specific Documents
33:14: Assessing LLMs for Zero-shot Abstractive Summarization Through the Lens of Relevance Paraphrasing
34:28: American Sign Language Handshapes Reflect Pressures for Communicative Efficiency