TechcraftingAI NLP Podcast - All Episodes

271

Ep. 263 - Part 2 - June 13, 2024

ArXiv NLP research for Thursday, June 13, 2024. 00:20: Chain-of-Though (CoT) prompting strategies for medical error detection and correction 01:31: CoastTerm: a Corpus for Multidisciplinary Term Extraction in Coastal Scientific Literature 02:52: RH-SQL: Refined Schema and Hardness Prompt for Text-to-SQL 04:01: Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs 05:24: Leveraging Explicit Reasoning for Inference Integration in Commonsense-Augmented Dialogue Models 06:38: Investigating the translation capabilities of Large Language Models trained on parallel data only 07:56: LASER: Learning by Aligning Self-supervised Representations of Speech for Improving Content-related Tasks 09:09: DefAn: Definitive Answer Dataset for LLMs Hallucination Evaluation 11:20: Test of Time: A Benchmark for Evaluating LLMs on Temporal Reasoning 12:46: Orthogonality and isotropy of speaker and phonetic information in self-supervised speech representations 13:53: Language Complexity and Speech Recognition Accuracy: Orthographic Complexity Hurts, Phonological Complexity Doesn't 14:47: ReadCtrl: Personalizing text generation with readability-controlled instruction learning 16:32: Self-Training for Sample-Efficient Active Learning for Text Classification with Pre-Trained Language Models 17:49: Sharing Matters: Analysing Neurons Across Languages and Tasks in LLMs 19:18: End-to-end Streaming model for Low-Latency Speech Anonymization 20:22: Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback 22:25: On the Effects of Heterogeneous Data Sources on Speech-to-Text Foundation Models 23:33: Understanding Jailbreak Success: A Study of Latent Space Dynamics in Large Language Models 24:35: Exploring Spoken Language Identification Strategies for Automatic Transcription of Multilingual Broadcast and Institutional Speech 25:47: AlignMMBench: Evaluating Chinese Multimodal Alignment in Large Vision-Language Models 27:15: Transformers meet 
Neural Algorithmic Reasoners 28:32: REVS: Unlearning Sensitive Information in Language Models via Rank Editing in the Vocabulary Space 30:02: Learning from Natural Language Explanations for Generalizable Entity Matching 31:14: ProxyLM: Predicting Language Model Performance on Multilingual Tasks via Proxy Models 32:29: DiscreteSLU: A Large Language Model with Self-Supervised Discrete Speech Units for Spoken Language Understanding 33:43: Improving Autoregressive Training with Dynamic Oracles

Jun 15, 2024

34m

270

Ep. 263 - Part 1 - June 13, 2024

ArXiv NLP research for Thursday, June 13, 2024. 00:20: Deep Exploration of Cross-Lingual Zero-Shot Generalization in Instruction Tuning 01:53: Mixture-of-Skills: Learning to Optimize Data Usage for Fine-Tuning Large Language Models 03:26: Automated Essay Scoring Using Grammatical Variety and Errors with Multi-Task Learning and Item Response Theory 04:33: Linguistic Bias in ChatGPT: Language Models Reinforce Dialect Discrimination 06:05: DisfluencySpeech -- Single-Speaker Conversational Speech Dataset with Paralanguage 07:26: Research on Optimization of Natural Language Processing Model Based on Multimodal Deep Learning 08:41: ContraSolver: Self-Alignment of Language Models by Resolving Internal Preference Contradictions 10:07: An Approach to Build Zero-Shot Slot-Filling System for Industry-Grade Conversational Assistants 11:42: Plan, Generate and Complicate: Improving Low-resource Dialogue State Tracking via Easy-to-Difficult Zero-shot Data Augmentation 12:42: No perspective, no perception!! 
Perspective-aware Healthcare Answer Summarization 14:28: Delta-CoMe: Training-Free Delta-Compression with Mixed-Precision for Large Language Models 16:02: An Initial Investigation of Language Adaptation for TTS Systems under Low-resource Scenarios 17:21: Navigating the Shadows: Unveiling Effective Disturbances for Modern AI Content Detectors 18:48: Exploring Multilingual Unseen Speaker Emotion Recognition: Leveraging Co-Attention Cues in Multitask Learning 19:52: Word Order in English-Japanese Simultaneous Interpretation: Analyses and Evaluation using Chunk-wise Monotonic Translation 21:12: Multi-Agent Software Development through Cross-Team Collaboration 22:55: LLM Reading Tea Leaves: Automatically Evaluating Topic Models with Large Language Models 24:14: Bayesian Statistical Modeling with Predictors from LLMs 25:39: ME-Switch: A Memory-Efficient Expert Switching Framework for Large Language Models 27:28: Language Models are Crossword Solvers 28:32: MiLoRA: Harnessing Minor Singular Components for Parameter-Efficient LLM Finetuning 29:51: CUDRT: Benchmarking the Detection of Human vs. Large Language Models Generated Texts 31:29: Living in the Moment: Can Large Language Models Grasp Co-Temporal Reasoning? 32:59: 3M: Multi-modal Multi-task Multi-teacher Learning for Game Event Detection 34:08: Modeling Comparative Logical Relation with Contrastive Learning for Text Generation 35:42: SciKnowEval: Evaluating Multi-level Scientific Knowledge of Large Language Models

Jun 15, 2024

37m

269

Ep. 262 - June 12, 2024

ArXiv NLP research for Wednesday, June 12, 2024. 00:19: VALL-E R: Robust and Efficient Zero-Shot Text-to-Speech Synthesis via Monotonic Alignment 02:05: BookSQL: A Large Scale Text-to-SQL Dataset for Accounting Domain 03:15: Designing a Dashboard for Transparency and Control of Conversational AI 04:46: Label-aware Hard Negative Sampling Strategies with Momentum Contrastive Learning for Implicit Hate Speech Detection 05:51: Exploring Speech Foundation Models for Speaker Diarization in Child-Adult Dyadic Interactions 06:53: Exploring Self-Supervised Multi-view Contrastive Learning for Speech Emotion Recognition with Limited Annotations 07:52: Guiding Frame-Level CTC Alignments Using Self-knowledge Distillation 08:55: DeTriever: Decoder-representation-based Retriever for Improving NL2SQL In-Context Learning 10:20: Automated Information Extraction from Thyroid Operation Narrative: A Comparative Study of GPT-4 and Fine-tuned KoELECTRA 11:35: Large Language Model Unlearning via Embedding-Corrupted Prompts 13:17: Defining and Detecting Vulnerability in Human Evaluation Guidelines: A Preliminary Study Towards Reliable NLG Evaluation 14:46: Better than Random: Reliable NLG Human Evaluation with Constrained Active Sampling 16:02: LibriTTS-P: A Corpus with Speaking Style and Speaker Identity Prompts for Text-to-Speech and Style Captioning 17:18: Guiding In-Context Learning of LLMs through Quality Estimation for Machine Translation 18:37: It Takes Two: On the Seamlessness between Reward and Policy Model in RLHF 20:02: Adversarial Evasion Attack Efficiency against Large Language Models 21:06: Learning Job Title Representation from Job Description Aggregation Network 21:59: Large Language Models Meet Text-Centric Multimodal Sentiment Analysis: A Survey 23:35: AustroTox: A Dataset for Target-Based Austrian German Offensive Language Detection 24:38: Languages Transferred Within the Encoder: On Representation Transfer in Zero-Shot Multilingual Translation 25:56: Multimodal Table 
Understanding 27:20: CoXQL: A Dataset for Parsing Explanation Requests in Conversational XAI Systems 28:51: Supportiveness-based Knowledge Rewriting for Retrieval-augmented Language Modeling 30:36: Legend: Leveraging Representation Engineering to Annotate Safety Margin for Preference Datasets 31:57: Semi-Supervised Spoken Language Glossification 33:16: Underneath the Numbers: Quantitative and Qualitative Gender Fairness in LLMs for Depression Prediction 34:37: A Dialogue Game for Eliciting Balanced Collaboration 35:23: Transformer-based Model for ASR N-Best Rescoring and Rewriting 36:16: SumHiS: Extractive Summarization Exploiting Hidden Structure 36:53: Figuratively Speaking: Authorship Attribution via Multi-Task Figurative Language Modeling 38:08: Leveraging Large Language Models for Web Scraping 39:51: M3T: A New Benchmark Dataset for Multi-Modal Document-Level Machine Translation 41:15: Is Programming by Example solved by LLMs? 42:29: Speech Emotion Recognition with ASR Transcripts: A Comprehensive Study on Word Error Rate and Fusion Techniques 43:42: Towards Unsupervised Speech Recognition Without Pronunciation Models 44:50: cPAPERS: A Dataset of Situated and Multimodal Interactive Conversations in Scientific Papers 45:57: Understanding Sounds, Missing the Questions: The Challenge of Object Hallucination in Large Audio-Language Models 47:02: Tailoring Generative AI Chatbots for Multiethnic Communities in Disaster Preparedness Communication: Extending the CASA Paradigm 48:12: Next-Generation Database Interfaces: A Survey of LLM-based Text-to-SQL 49:56: TasTe: Teaching Large Language Models to Translate through Self-Reflection 51:28: OLMES: A Standard for Language Model Evaluations 52:47: Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing

Jun 13, 2024

54m

268

Ep. 261 - Part 2 - June 11, 2024

ArXiv NLP research for Tuesday, June 11, 2024. 00:20: Scientific Computing with Large Language Models 01:08: Speaking Your Language: Spatial Relationships in Interpretable Emergent Communication 02:19: Bilingual Sexism Classification: Fine-Tuned XLM-RoBERTa and GPT-3.5 Few-Shot Learning 03:51: Fine-tuning with HED-IT: The impact of human post-editing for dialogical language models 05:26: Can We Achieve High-quality Direct Speech-to-Speech Translation without Parallel Speech Data? 07:03: Joint Learning of Context and Feedback Embeddings in Spoken Dialogue 07:57: BertaQA: How Much Do Language Models Know About Local Culture? 09:17: MM-KWS: Multi-modal Prompts for Multilingual User-defined Keyword Spotting 10:20: CTC-based Non-autoregressive Textless Speech-to-Speech Translation 11:21: Toxic Memes: A Survey of Computational Perspectives on the Detection and Explanation of Meme Toxicities 13:27: GLIMPSE: Pragmatically Informative Multi-Document Summarization for Scholarly Reviews 14:40: BvSP: Broad-view Soft Prompting for Few-Shot Aspect Sentiment Quad Prediction 16:32: When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models 18:01: Limited Out-of-Context Knowledge Reasoning in Large Language Models 19:36: MINERS: Multilingual Language Models as Semantic Retrievers 20:42: Learning Domain-Invariant Features for Out-of-Context News Detection 22:03: Textual Similarity as a Key Metric in Machine Translation Quality Estimation 23:02: On the Robustness of Document-Level Relation Extraction Models to Entity Name Variations 24:31: Multimodal Belief Prediction 25:29: Advancing Annotation of Stance in Social Media Posts: A Comparative Analysis of Large Language Models and Crowd Sourcing 26:56: Paraphrasing in Affirmative Terms Improves Negation Understanding 27:37: CADS: A Systematic Literature Review on the Challenges of Abstractive Dialogue Summarization 29:38: TextGrad: Automatic "Differentiation" via Text 
31:35: Just Because We Camp, Doesn't Mean We Should: The Ethics of Modelling Queer Voices 32:35: THaLLE: Text Hyperlocally Augmented Large Language Extension -- Technical Report 33:51: Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling 35:22: Simple and Effective Masked Diffusion Language Models 36:35: Open-LLM-Leaderboard: From Multi-choice to Open-style Questions for LLMs Evaluation, Benchmark, and Arena

Jun 13, 2024

38m

267

Ep. 261 - Part 1 - June 11, 2024

ArXiv NLP research for Tuesday, June 11, 2024. 00:20: A Non-autoregressive Generation Framework for End-to-End Simultaneous Speech-to-Any Translation 01:41: Post-Hoc Answer Attribution for Grounded and Trustworthy Long Document Comprehension: Task, Insights, and Challenges 02:32: A Probabilistic Framework for LLM Hallucination Detection via Belief Tree Propagation 04:08: Evolving Subnetwork Training for Large Language Models 05:31: Missingness-resilient Video-enhanced Multimodal Disfluency Detection 06:37: Mitigating Boundary Ambiguity and Inherent Bias for Text Classification in the Era of Large Language Models 08:14: Crayon: Customized On-Device LLM via Instant Adapter Blending and Edge-Server Hybrid Inference 09:33: Delving into ChatGPT usage in academic writing through excess vocabulary 10:53: Paying More Attention to Source Context: Mitigating Unfaithful Translations from Large Language Model 12:12: CoEvol: Constructing Better Responses for Instruction Finetuning through Multi-Agent Cooperation 13:26: Effectively Compress KV Heads for LLM 15:00: Benchmarking Trustworthiness of Multimodal Large Language Models: A Comprehensive Study 16:54: Reading Miscue Detection in Primary School through Automatic Speech Recognition 18:09: HalluDial: A Large-Scale Benchmark for Automatic Dialogue-Level Hallucination Evaluation 20:01: DARA: Decomposition-Alignment-Reasoning Autonomous Language Agent for Question Answering over Knowledge Graphs 21:15: Efficiently Exploring Large Language Models for Document-Level Machine Translation with In-context Learning 22:35: Advancing Tool-Augmented Large Language Models: Integrating Insights from Errors in Inference Trees 24:42: Translating speech with just images 25:35: Never Miss A Beat: An Efficient Recipe for Context Window Extension of Large Language Models with Consistent "Middle" Enhancement 26:51: Teaching Language Models to Self-Improve by Learning from Language Feedback 28:25: Merging Improves Self-Critique Against Jailbreak 
Attacks 29:18: Towards Human-AI Collaboration in Healthcare: Guided Deferral Systems with Large Language Models 30:11: Improving Autoformalization using Type Checking 31:37: Improving Commonsense Bias Classification by Mitigating the Influence of Demographic Terms 33:19: Decipherment-Aware Multilingual Learning in Jointly Trained Language Models 34:20: DUAL-REFLECT: Enhancing Large Language Models for Reflective Translation through Dual Learning Feedback Mechanisms 35:20: On the Hallucination in Simultaneous Machine Translation 36:07: MBBQ: A Dataset for Cross-Lingual Comparison of Stereotypes in Generative LLMs 37:42: Scholarly Question Answering using Large Language Models in the NFDI4DataScience Gateway

Jun 13, 2024

38m

266

Ep. 260 - June 10, 2024

ArXiv NLP research for Monday, June 10, 2024. 00:19: Shoulders of Giants: A Look at the Degree and Utility of Openness in NLP Research 00:59: HOLMES: Hyper-Relational Knowledge Graphs for Multi-hop Question Answering using LLMs 02:29: The Curse of Popularity: Popular Entities have Catastrophic Side Effects when Deleting Knowledge from Language Models 03:24: MATES: Model-Aware Data Selection for Efficient Pretraining with Data Influence Models 04:51: A Multidimensional Framework for Evaluating Lexical Semantic Change with Social Science Applications 05:49: Synth-SBDH: A Synthetic Dataset of Social and Behavioral Determinants of Health for Clinical Text 07:10: Efficient k-Nearest-Neighbor Machine Translation with Dynamic Retrieval 09:08: Recurrent Context Compression: Efficiently Expanding the Context Window of LLM 10:35: Enhancing Long-Term Memory using Hierarchical Aggregate Tree for Retrieval Augmented Generation 11:26: Verifiable Generation with Subsentence-Level Fine-Grained Citations 12:36: Comparing Data Augmentation Methods for End-to-End Task-Oriented Dialog Systems 13:55: Building Bridges: A Dataset for Evaluating Gender-Fair Machine Translation into German 15:28: Can I understand what I create? 
Self-Knowledge Evaluation of Large Language Models 16:28: Language Models Resist Alignment 17:58: LINGOLY: A Benchmark of Olympiad-Level Linguistic Reasoning Puzzles in Low-Resource and Extinct Languages 19:27: Learning Fine-Grained Controllability on Speech Generation via Efficient Fine-Tuning 20:27: Combining Embeddings and Domain Knowledge for Job Posting Duplicate Detection 21:37: MaskLID: Code-Switching Language Identification through Iterative Masking 22:49: Multi-Prompting Decoder Helps Better Language Understanding 24:22: Tx-LLM: A Large Language Model for Therapeutics 26:21: Self-Tuning: Instructing LLMs to Effectively Acquire New Knowledge through Self-Teaching 27:43: A Parameter-efficient Language Extension Framework for Multilingual ASR 29:06: MedExQA: Medical Question Answering Benchmark with Multiple Explanations 30:36: Sustained Vowels for Pre- vs Post-Treatment COPD Classification 31:49: MASSW: A New Dataset and Benchmark Tasks for AI-Assisted Scientific Workflows 33:40: Symmetric Dot-Product Attention for Efficient Training of BERT Language Models 35:00: Annotation alignment: Comparing LLM and human annotations of conversational safety 36:07: mHuBERT-147: A Compact Multilingual HuBERT Model 37:27: Should We Fine-Tune or RAG? 
Evaluating Different Techniques to Adapt LLMs for Dialogue 39:00: INTERSPEECH 2009 Emotion Challenge Revisited: Benchmarking 15 Years of Progress in Speech Emotion Recognition 40:06: Meta Learning Text-to-Speech Synthesis in over 7000 Languages 40:59: Controlling Emotion in Text-to-Speech with Natural Language Prompts 41:55: Language Models are Alignable Decision-Makers: Dataset and Application to the Medical Triage Domain 43:29: Multimodal Contextualized Semantic Parsing from Speech 44:25: Interpretability of Language Models via Task Spaces 45:45: Evaluating the Retrieval Component in LLM-Based Question Answering Systems 46:52: Reasoning in Token Economies: Budget-Aware Evaluation of LLM Reasoning Strategies 48:08: Can Language Models Serve as Text-Based World Simulators?

Jun 11, 2024

49m

265

Ep. 259 - June 9, 2024

ArXiv NLP research for Sunday, June 09, 2024. 00:19: How Alignment and Jailbreak Work: Explain LLM Safety through Intermediate Hidden States 01:40: DomainRAG: A Chinese Benchmark for Evaluating Domain-specific Retrieval-Augmented Generation 03:25: Do LLMs Exhibit Human-Like Reasoning? Evaluating Theory of Mind in LLMs for Open-Ended Responses 05:08: MS-HuBERT: Mitigating Pre-training and Inference Mismatch in Masked Language Modelling methods for learning Speech Representations 06:17: SinkLoRA: Enhanced Efficiency and Chat Capabilities for Long-Context Large Language Models 08:11: Peer Review as A Multi-Turn and Long-Context Dialogue with Role-Based Interactions 09:54: MoPS: Modular Story Premise Synthesis for Open-Ended Automatic Story Generation 11:20: QGEval: A Benchmark for Question Generation Evaluation 12:44: MrRank: Improving Question Answering Retrieval System through Multi-Result Ranking Model 13:43: Arabic Diacritics in the Wild: Exploiting Opportunities for Improved Diacritization 14:46: The BiGGen Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models with Language Models 16:30: RE-RAG: Improving Open-Domain QA Performance and Interpretability with Relevance Estimator in Retrieval-Augmented Generation 18:14: Hidden Holes: topological aspects of language models 19:46: Do Prompts Really Prompt? Exploring the Prompt Understanding Capability of Whisper 20:40: Seventeenth-Century Spanish American Notary Records for Fine-Tuning Spanish Large Language Models 22:02: MedREQAL: Examining Medical Knowledge Recall of Large Language Models via Question Answering 23:12: II-Bench: An Image Implication Understanding Benchmark for Multimodal Large Language Models 25:17: Zero-Shot End-To-End Spoken Question Answering In Medical Domain 26:27: Are Large Language Models Actually Good at Text Style Transfer? 
27:32: Feriji: A French-Zarma Parallel Corpus, Glossary & Translator 28:56: TTM-RE: Memory-Augmented Document-Level Relation Extraction 30:12: Why Don't Prompt-Based Fairness Metrics Correlate? 31:27: Hello Again! LLM-powered Personalized Agent for Long-term Dialogue 33:12: Semisupervised Neural Proto-Language Reconstruction 34:12: Prompting Large Language Models with Audio for General-Purpose Speech Summarization 35:14: A Dual-View Approach to Classifying Radiology Reports by Co-Training 36:07: ThaiCoref: Thai Coreference Resolution Dataset

Jun 11, 2024

37m

264

Ep. 258 - June 8, 2024

ArXiv NLP research for Saturday, June 08, 2024. 00:19: MemeGuard: An LLM and VLM-based Framework for Advancing Content Moderation via Meme Intervention 01:44: Toward Reliable Ad-hoc Scientific Information Extraction: A Case Study on Two Materials Datasets 02:30: Flexible and Adaptable Summarization via Expertise Separation 04:18: Write Summary Step-by-Step: A Pilot Study of Stepwise Summarization 06:07: CaLM: Contrasting Large and Small Language Models to Verify Grounded Generation 07:23: Venn Diagram Prompting : Accelerating Comprehension with Scaffolding Effect 08:45: VALL-E 2: Neural Codec Language Models are Human Parity Zero-Shot Text to Speech Synthesizers 10:19: Planning Like Human: A Dual-process Framework for Dialogue Planning 11:48: Deconstructing The Ethics of Large Language Models from Long-standing Issues to New-emerging Dilemmas 12:57: Recent advancements in computational morphology : A comprehensive survey 14:01: MaTableGPT: GPT-based Table Data Extractor from Materials Science Literature 15:41: Design of reliable technology valuation model with calibrated machine learning of patent indicators 17:08: Fighting Against the Repetitive Training and Sample Dependency Problem in Few-shot Named Entity Recognition 18:59: Investigating and Addressing Hallucinations of LLMs in Tasks Involving Negation 20:25: Generalist Multimodal AI: A Review of Architectures, Challenges and Opportunities 21:47: ThatiAR: Subjectivity Detection in Arabic News Sentences 23:07: Do LLMs Recognize me, When I is not me: Assessment of LLMs Understanding of Turkish Indexical Pronouns in Indexical Shift Contexts 24:49: Creativity Has Left the Chat: The Price of Debiasing Language Models 25:57: CERET: Cost-Effective Extrinsic Refinement for Text Generation 27:05: GrowOVER: How Can LLMs Adapt to Growing Real-World Knowledge? 
28:07: Video-Language Understanding: A Survey from Model Architecture, Model Training, and Data Perspectives 29:03: ATLAS: Improving Lay Summarisation with Attribute-based Control

Jun 11, 2024

30m

263

Ep. 257 - June 7, 2024

ArXiv NLP research for Friday, June 07, 2024. 00:19: Key-Element-Informed sLLM Tuning for Document Summarization 01:22: Low-Resource Cross-Lingual Summarization through Few-Shot Learning with Large Language Models 02:42: Large Language Model-guided Document Selection 04:13: More Victories, Less Cooperation: Assessing Cicero's Diplomacy Play 05:24: DiNeR: a Large Realistic Dataset for Evaluating Compositional Generalization 06:43: MATTER: Memory-Augmented Transformer Using Heterogeneous Knowledge Sources 08:01: Mixture-of-Agents Enhances Large Language Model Capabilities 09:09: AICoderEval: Improving AI Domain Code Generation of Large Language Models 11:00: CRAG -- Comprehensive RAG Benchmark 13:04: CRiskEval: A Chinese Multi-Level Risk Evaluation Benchmark Dataset for Large Language Models 14:52: Think out Loud: Emotion Deducing Explanation in Dialogues 16:43: WildBench: Benchmarking LLMs with Challenging Tasks from Real Users in the Wild 18:46: SelfGoal: Your Language Agents Already Know How to Achieve High-level Goals 19:58: BERTs are Generative In-Context Learners 20:43: Annotating FrameNet via Structure-Conditioned Language Generation 21:49: Revisiting Catastrophic Forgetting in Large Language Model Tuning 22:43: FedLLM-Bench: Realistic Benchmarks for Federated Learning of Large Language Models 24:33: Do Language Models Exhibit Human-like Structural Priming Effects? 
25:27: Uncertainty Aware Learning for Language Model Alignment 26:50: The Russian Legislative Corpus 27:24: ComplexTempQA: A Large-Scale Dataset for Complex Temporal Question Answering 28:53: HateDebias: On the Diversity and Variability of Hate Speech Debiasing 30:29: A Deep Dive into the Trade-Offs of Parameter-Efficient Preference Alignment Techniques 32:00: Sexism Detection on a Data Diet 33:18: XTTS: a Massively Multilingual Zero-Shot Text-to-Speech Model 34:21: Through the Thicket: A Study of Number-Oriented LLMs derived from Random Forest Models 35:32: LLM-based speaker diarization correction: A generalizable approach 36:52: TCMD: A Traditional Chinese Medicine QA Dataset for Evaluating Large Language Models 38:10: BAMO at SemEval-2024 Task 9: BRAINTEASER: A Novel Task Defying Common Sense 39:10: Quantifying Geospatial in the Common Crawl Corpus 40:14: MEFT: Memory-Efficient Fine-Tuning through Sparse Adapter 41:47: Language models emulate certain cognitive profiles: An investigation of how predictability measures interact with individual differences 43:19: Compositional Generalization with Grounded Language Models 44:26: Scenarios and Approaches for Situated Natural Language Explanations 46:04: Are Large Language Models More Empathetic than Humans? 47:38: SUMIE: A Synthetic Benchmark for Incremental Entity Summarization 48:52: Multi-Head RAG: Solving Multi-Aspect Problems with LLMs 50:33: An Empirical Study on Parameter-Efficient Fine-Tuning for MultiModal Large Language Models

Jun 10, 2024

52m

262

Ep. 256 - Part 2 - June 6, 2024

ArXiv NLP research for Thursday, June 06, 2024. 00:20: The syntax-semantics interface in a child's path: A study of 3- to 11-year-olds' elicited production of Mandarin recursive relative clauses 02:17: Ask LLMs Directly, "What shapes your bias?": Measuring Social Bias in Large Language Models 03:39: Explainability and Hate Speech: Structured Explanations Make Social Media Moderators Faster 04:36: Intention and Face in Dialog 05:48: Uncovering Limitations of Large Language Models in Information Seeking from Tables 07:15: Are We Done with MMLU? 08:41: Legal Judgment Reimagined: PredEx and the Rise of Intelligent AI Interpretation in Indian Courts 09:53: Do Language Models Understand Morality? Towards a Robust Detection of Moral Content 11:47: Every Answer Matters: Evaluating Commonsense with Probabilistic Measures 12:49: Towards Understanding Task-agnostic Debiasing Through the Lenses of Intrinsic Bias and Forgetfulness 14:26: Pointer-Guided Pre-Training: Infusing Large Language Models with Paragraph-Level Contextual Awareness 15:35: Confabulation: The Surprising Value of Large Language Model Hallucinations 16:42: DICE: Detecting In-distribution Contamination in LLM's Fine-tuning Phase for Math Reasoning 18:25: Legal Documents Drafting with Fine-Tuned Pre-Trained Large Language Model 19:32: ValueBench: Towards Comprehensively Evaluating Value Orientations and Understanding of Large Language Models 20:50: mCSQA: Multilingual Commonsense Reasoning Dataset with Unified Creation Strategy by Language Models and Humans 22:21: What Do Language Models Learn in Context? The Structured Task Hypothesis 23:38: Rethinking LLM and Linguistic Steganalysis: An Efficient Detection of Strongly Concealed Stego 24:58: BEADs: Bias Evaluation Across Domains 26:41: FairytaleQA Translated: Enabling Educational Question and Answer Generation in Less-Resourced Languages 28:03: Benchmark Data Contamination of Large Language Models: A Survey 29:02: Transformers need glasses! 
Information over-squashing in language tasks 30:26: Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models 31:58: Characterizing Similarities and Divergences in Conversational Tones in Humans and LLMs by Sampling with People 33:44: ABEX: Data Augmentation for Low-Resource NLU via Expanding Abstract Descriptions 35:19: What Languages are Easy to Language-Model? A Perspective from Learning Probabilistic Regular Languages 36:41: PaCE: Parsimonious Concept Engineering for Large Language Models

Jun 7, 2024

38m

261

Ep. 256 - Part 1 - June 6, 2024

ArXiv NLP research for Thursday, June 06, 2024. 00:20: Efficient Knowledge Infusion via KG-LLM Alignment 01:25: NAP^2: A Benchmark for Naturalness and Privacy-Preserving Text Rewriting by Learning from Human 02:34: Character-Level Chinese Dependency Parsing via Modeling Latent Intra-Word Structure 03:30: XL-HeadTags: Leveraging Multimodal Retrieval Augmentation for the Multilingual Generation of News Headlines and Tags 04:59: End-to-End Trainable Soft Retriever for Low-resource Relation Extraction 06:07: Light-PEFT: Lightening Parameter-Efficient Fine-Tuning via Early Pruning 07:37: Improving Zero-Shot Chinese-English Code-Switching ASR with kNN-CTC and Gated Monolingual Datastores 08:52: ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search 10:29: Chaos with Keywords: Exposing Large Language Models Sycophancy to Misleading Keywords and Evaluating Defense Strategies 11:39: Lean Workbook: A large-scale Lean problem set formalized from natural language math problems 12:56: Speculative Decoding via Early-exiting for Faster LLM Inference with Thompson Sampling Control Mechanism 14:18: Performance of large language models in numerical vs. semantic medical knowledge: Benchmarking on evidence-based Q&As 16:24: Recovering document annotations for sentence-level bitext 17:40: BLSP-Emo: Towards Empathetic Large Speech-Language Models 19:01: Decoder-only Streaming Transformer for Simultaneous Translation 20:28: Evaluating the IWSLT2023 Speech Translation Tasks: Human Annotations, Automatic Metrics, and Segmentation 21:53: Spontaneous Speech-Based Suicide Risk Detection Using Whisper and Large Language Models 23:06: How Good is Zero-Shot MT Evaluation for Low Resource Indian Languages? 
24:13: HeSum: a Novel Dataset for Abstractive Text Summarization in Hebrew 25:19: ArMeme: Propagandistic Content in Arabic Memes 26:26: Culturally Aware and Adapted NLP: A Taxonomy and a Survey of the State of the Art 27:11: UltraMedical: Building Specialized Generalists in Biomedicine 28:43: Tox-BART: Leveraging Toxicity Attributes for Explanation Generation of Implicit Hate Speech 30:02: A + B: A General Generator-Reader Framework for Optimizing LLMs to Unleash Synergy Potential 31:29: On The Persona-based Summarization of Domain-Specific Documents 33:14: Assessing LLMs for Zero-shot Abstractive Summarization Through the Lens of Relevance Paraphrasing 34:28: American Sign Language Handshapes Reflect Pressures for Communicative Efficiency

Jun 7, 2024

35m

260

Ep. 255 - June 5, 2024

ArXiv NLP research for Wednesday, June 05, 2024. 00:19: Improving In-Context Learning with Prediction Feedback for Sentiment Analysis 01:24: MultifacetEval: Multifaceted Evaluation to Probe LLMs in Mastering Medical Knowledge 03:01: Text Injection for Neural Contextual Biasing 04:16: 4D ASR: Joint Beam Search Integrating CTC, Attention, Transducer, and Mask Predict Decoders 06:03: Adversarial Moment-Matching Distillation of Large Language Models 07:05: Docs2KG: Unified Knowledge Graph Construction from Heterogeneous Documents Assisted by Large Language Models 08:48: Readability-guided Idiom-aware Sentence Simplification (RISS) for Chinese 09:56: Evaluation of data inconsistency for multi-modal sentiment analysis 10:55: BadAgent: Inserting and Activating Backdoor Attacks in LLM Agents 12:11: Unveiling Selection Biases: Exploring Order and Token Sensitivity in Large Language Models 13:16: From Tarzan to Tolkien: Controlling the Language Proficiency Level of LLMs for Content Generation 14:20: StreamSpeech: Simultaneous Speech-to-Speech Translation with Multi-task Learning 15:42: RadBARTsum: Domain Specific Adaption of Denoising Sequence-to-Sequence Models for Abstractive Radiology Report Summarization 17:00: Towards Detecting LLMs Hallucination via Markov Chain-based Multi-agent Debate Framework 18:14: Cryptocurrency Frauds for Dummies: How ChatGPT introduces us to fraud? 19:48: FragRel: Exploiting Fragment-level Relations in the External Memory of Large Language Models 20:59: Space Decomposition for Sentence Embedding 22:00: Towards Real-world Scenario: Imbalanced New Intent Discovery 23:40: Which Side Are You On? 
A Multi-task Dataset for End-to-End Argument Summarisation and Evaluation 25:20: CSS: Contrastive Semantic Similarity for Uncertainty Quantification of LLMs 27:03: StatBot.Swiss: Bilingual Open Data Exploration in Natural Language 28:10: Missci: Reconstructing Fallacies in Misrepresented Science 29:43: ChatLang-8: An LLM-Based Synthetic Data Generation Framework for Grammatical Error Correction 30:47: Linking Named Entities in Diderot's Encyclopédie to Wikidata 32:06: Error-preserving Automatic Speech Recognition of Young English Learners' Language 33:37: Document-level Claim Extraction and Decontextualisation for Fact-Checking 34:45: The Challenges of Evaluating LLM Applications: An Analysis of Automated, Human, and LLM-Based Approaches 36:09: LLM-based Rewriting of Inappropriate Argumentation using Reinforcement Learning from Machine Feedback 37:39: IrokoBench: A New Benchmark for African Languages in the Age of Large Language Models 39:46: Automating Turkish Educational Quiz Generation Using Large Language Models 41:34: Cycles of Thought: Measuring LLM Confidence through Stable Explanations 42:57: Are language models rational? The case of coherence norms and belief revision 43:58: What is the Best Way for ChatGPT to Translate Poetry? 45:20: Using Synchronic Definitions and Semantic Relations to Classify Semantic Change Types 46:14: MODABS: Multi-Objective Learning for Dynamic Aspect-Based Summarization 47:09: BIPED: Pedagogically Informed Tutoring System for ESL Education 48:24: Analyzing LLM Behavior in Dialogue Summarization: Unveiling Circumstantial Hallucination Trends 50:00: Wings: Learning Multimodal LLMs without Text-only Forgetting

Jun 6, 2024

51m

259

Ep. 254 - Part 2 - June 4, 2024

ArXiv NLP research for Tuesday, June 04, 2024. 00:20: Description Boosting for Zero-Shot Entity and Relation Classification 01:44: Modeling Emotional Trajectories in Written Stories Utilizing Transformers and Weakly-Supervised Learning 03:09: Enhancing Retrieval-Augmented LMs with a Two-stage Consistency Learning Compressor 04:30: Prompting Large Language Models with Human Error Markings for Self-Correcting Machine Translation 05:41: mCoT: Multilingual Instruction Tuning for Reasoning Consistency in Language Models 06:53: Technical Language Processing for Telecommunications Specifications 08:09: On Affine Homotopy between Language Encoders 09:25: Translation Deserves Better: Analyzing Translation Artifacts in Cross-lingual Visual Question Answering 10:32: Probing the Category of Verbal Aspect in Transformer Language Models 11:58: Linguistic Fingerprint in Transformer Models: How Language Variation Influences Parameter Selection in Irony Detection 13:03: LlamaCare: A Large Medical Language Model for Enhancing Healthcare Knowledge Sharing 14:33: Retaining Key Information under High Compression Ratios: Query-Guided Compressor for LLMs 15:51: On the Intrinsic Self-Correction Capability of LLMs: Uncertainty and Latent Concept 17:30: Multiple Choice Questions and Large Languages Models: A Case Study with Fictional Medical Data 19:08: The Scandinavian Embedding Benchmarks: Comprehensive Assessment of Multilingual and Monolingual Text Embedding 20:07: Representations as Language: An Information-Theoretic Framework for Interpretability 21:32: Analyzing Temporal Complex Events with Large Language Models? 
A Benchmark towards Temporal, Long Context Understanding 22:46: Hiding Text in Large Language Models: Introducing Unconditional Token Forcing Confusion 24:21: Language-Universal Speech Attributes Modeling for Zero-Shot Multilingual Spoken Keyword Recognition 25:37: Deterministic Reversible Data Augmentation for Neural Machine Translation 26:39: CheckEmbed: Effective Verification of LLM Solutions to Open-Ended Tasks 28:14: Scalable MatMul-free Language Modeling 30:03: SpecExec: Massively Parallel Speculative Decoding for Interactive LLM Inference on Consumer Devices 31:37: Mitigate Position Bias in Large Language Models via Scaling a Single Dimension 33:10: TopViewRS: Vision-Language Models as Top-View Spatial Reasoners

Jun 5, 2024

35m

258

Ep. 254 - Part 1 - June 4, 2024

ArXiv NLP research for Tuesday, June 04, 2024. 00:20: Conditional Language Learning with Context 01:13: Zyda: A 1.3T Dataset for Open Language Modeling 02:32: RKLD: Reverse KL-Divergence-based Knowledge Distillation for Unlearning Personal Information in Large Language Models 03:50: Personalized Topic Selection Model for Topic-Grounded Dialogue 05:20: Position Debiasing Fine-Tuning for Causal Perception in Long-Term Dialogue 06:58: Phonetic Enhanced Language Modeling for Text-to-Speech Synthesis 08:03: Why Would You Suggest That? Human Trust in Language Model Responses 09:10: Multimodal Reasoning with Multimodal Knowledge Graph 10:30: QROA: A Black-Box Query-Response Optimization Attack on LLMs 11:55: Analyzing Social Biases in Japanese Large Language Models 12:52: I've got the "Answer"! Interpretation of LLMs Hidden States in Question Answering 13:47: PyramidKV: Dynamic KV Cache Compression based on Pyramidal Information Funneling 15:16: Assessing the Performance of Chinese Open Source Large Language Models in Information Extraction Tasks 16:38: LongSSM: On the Length Extension of State-space Models in Language Modelling 17:30: Exploring Mathematical Extrapolation of Large Language Models with Synthetic Data 18:40: MARS: Benchmarking the Metaphysical Reasoning Abilities of Language Models with a Multi-task Evaluation Dataset 20:19: UniOQA: A Unified Framework for Knowledge Graph Question Answering with Large Language Models 22:03: Diver: Large Language Model Decoding with Span-Level Mutual Information Verification 23:12: SimulTron: On-Device Simultaneous Speech to Speech Translation 24:28: The current status of large language models in summarizing radiology report impressions 26:10: Reinforcement Tuning for Detecting Stances and Debunking Rumors Jointly with Large Language Models 27:17: Synergetic Event Understanding: A Collaborative Approach to Cross-Document Event Coreference Resolution with Large Language Models 28:46: A multilingual dataset for offensive language and hate speech detection for hausa, yoruba and igbo languages 29:40: FedMKT: Federated Mutual Knowledge Transfer for Large and Small Language Models 31:17: Self-Modifying State Modeling for Simultaneous Machine Translation

Jun 5, 2024

33m

257

Ep. 253 - June 3, 2024

ArXiv NLP research for Monday, June 03, 2024. 00:19: Luna: An Evaluation Foundation Model to Catch Language Model Hallucinations with High Accuracy and Low Cost 01:38: Generative Pre-trained Speech Language Model with Efficient Hierarchical Transformer 03:06: Selectively Answering Visual Questions 04:11: Take its Essence, Discard its Dross! Debiasing for Toxic Language Detection via Counterfactual Causal Effect 05:36: Predicting Drug-Gene Relations via Analogy Tasks with Word Embeddings 06:51: SemCoder: Training Code Language Models with Comprehensive Semantics 08:39: Mobile-Agent-v2: Mobile Device Operation Assistant with Effective Navigation via Multi-Agent Collaboration 10:26: Combining Qualitative and Computational Approaches for Literary Analysis of Finnish Novels 11:45: Strengthened Symbol Binding Makes Large Language Models Reliable Multiple-Choice Selectors 13:26: Decompose, Enrich, and Extract! Schema-aware Event Extraction using LLMs 14:34: MACT: Model-Agnostic Cross-Lingual Training for Discourse Representation Structure Parsing 15:48: Guiding ChatGPT to Generate Salient Domain Summaries 17:51: Synergizing Unsupervised and Supervised Learning: A Hybrid Approach for Accurate Natural Language Task Modeling 19:30: TCMBench: A Comprehensive Benchmark for Evaluating Large Language Models in Traditional Chinese Medicine 21:38: Explore then Determine: A GNN-LLM Synergy Framework for Reasoning over Knowledge Graph 22:51: Two Tales of Persona in LLMs: A Survey of Role-Playing and Personalization 24:08: Are AI-Generated Text Detectors Robust to Adversarial Perturbations? 
25:42: Automatic Essay Multi-dimensional Scoring with Fine-tuning and Multiple Regression 26:35: Improving Pseudo Labels with Global-Local Denoising Framework for Cross-lingual Named Entity Recognition 28:01: Demonstration Augmentation for Zero-shot In-context Learning 29:31: EffiQA: Efficient Question-Answering with Strategic Multi-Model Collaboration on Knowledge Graphs 31:05: Towards Scalable Automated Alignment of LLMs: A Survey 32:19: EduNLP: Towards a Unified and Modularized Library for Educational Resources 33:44: Focus on the Core: Efficient Attention via Pruned Token Compression for Document Classification 35:07: Improved Few-Shot Jailbreaking Can Circumvent Aligned Language Models and Their Defenses 36:36: When Can LLMs Actually Correct Their Own Mistakes? A Critical Survey of Self-Correction of LLMs 37:58: CodeR: Issue Resolving with Multi-Agent and Task Graphs 38:54: Unsupervised Distractor Generation via Large Language Model Distilling and Counterfactual Contrastive Decoding 40:10: FactGenius: Combining Zero-Shot Prompting and Fuzzy Relation Mining to Improve Fact Verification with Knowledge Graphs 41:27: Probing Language Models for Pre-training Data Detection 42:45: R2C2-Coder: Enhancing and Benchmarking Real-world Repository-level Code Completion Abilities of Code Large Language Models 44:32: Privacy in LLM-based Recommendation: Recent Advances and Future Directions 45:23: Linguistic Analysis, Description, and Typological Exploration with Categorial Grammar (TheBench Guide) 46:52: D-CPT Law: Domain-specific Continual Pre-Training Scaling Law for Large Language Models 48:52: Do Large Language Models Perform the Way People Expect? 
Measuring the Human Generalization Function 50:07: Sparsity-Accelerated Training for Large Language Models 51:36: Superhuman performance in urology board questions by an explainable large language model enabled for context integration of the European Association of Urology guidelines: the UroBot study 53:34: Editing the Mind of Giants: An In-Depth Exploration of Pitfalls of Knowledge Editing in Large Language Models 54:42: LexMatcher: Dictionary-centric Data Collection for LLM-based Machine Translation 55:55: Enabling ASR for Low-Resource Languages: A Comprehensive Dataset Creation Approach 57:10: Understanding Token Probability Encoding in Output Embeddings

Jun 4, 2024

1h 10m

256

Ep. 252 - June 2, 2024

ArXiv NLP research for Sunday, June 02, 2024. 00:19: Prompt Framework for Role-playing: Generation and Evaluation 01:05: Transforming Computer Security and Public Trust Through the Exploration of Fine-Tuning Large Language Models 02:18: Enhancing Zero-shot Text-to-Speech Synthesis with Human Feedback 03:54: Presence or Absence: Are Unknown Word Usages in Dictionaries? 05:09: Topic Modeling for Short Texts with Large Language Models 06:09: How well do distributed representations convey contextual lexical semantics: a Thesis Proposal 07:05: Evaluating Mathematical Reasoning of Large Language Models: A Focus on Error Identification and Correction 08:27: Automatic Instruction Evolving for Large Language Models 09:25: Applying Intrinsic Debiasing on Downstream Tasks: Challenges and Considerations for Machine Translation 10:26: Developing an efficient corpus using Ensemble Data cleaning approach 11:51: BoNBoN Alignment for Large Language Models and the Sweetness of Best-of-n Sampling 13:15: FOCUS: Forging Originality through Contrastive Use in Self-Plagiarism for Language Models 14:51: The Power of Summary-Source Alignments 16:11: Formality Style Transfer in Persian 17:39: Show, Don't Tell: Aligning Language Models with Demonstrated Feedback 19:08: YODAS: Youtube-Oriented Dataset for Audio and Speech 20:13: MEDIQ: Question-Asking LLMs for Adaptive and Reliable Medical Reasoning 22:15: A Survey of Useful LLM Evaluation 23:31: Unveil the Duality of Retrieval-Augmented Generation: Theoretical Analysis and Practical Solution 25:07: Annotation Guidelines-Based Knowledge Augmentation: Towards Enhancing Large Language Models for Educational Text Classification 27:18: Using RL to Identify Divisive Perspectives Improves LLMs Abilities to Identify Communities on Social Media

Jun 4, 2024

28m

255

Ep. 251 - June 1, 2024

ArXiv NLP research for Saturday, June 01, 2024. 00:19: Multi-Dimensional Optimization for Text Summarization via Reinforcement Learning 01:41: CASE: Curricular Data Pre-training for Building Generative and Discriminative Assistive Psychology Expert Models 03:25: Beyond Metrics: Evaluating LLMs' Effectiveness in Culturally Nuanced, Low-Resource Real-World Scenarios 05:03: RoBERTa-BiLSTM: A Context-Aware Hybrid Model for Sentiment Analysis 07:09: The Best of Both Worlds: Toward an Honest and Helpful Large Language Model 09:02: Gender Bias Detection in Court Decisions: A Brazilian Case Study 10:41: Prompt Chaining or Stepwise Prompt? Refinement in Text Summarization 11:54: A Survey on Large Language Models for Code Generation 13:43: Guiding and Diversifying LLM-Based Story Generation via Answer Set Programming 14:46: SPAGHETTI: Open-Domain Question Answering from Heterogeneous Data Sources with Retrieval and Semantic Parsing 15:43: LongSkywork: A Training Recipe for Efficiently Extending Context Length in Large Language Models 17:24: LLMs Could Autonomously Learn Without External Supervision

Jun 4, 2024

18m

254

Ep. 250 - May 31, 2024

ArXiv NLP research summaries for May 31, 2024. 00:20 FineRadScore: A Radiology Report Line-by-Line Evaluation Technique Generating Corrections with Severity Scores 01:37 Leveraging Large Language Models for Entity Matching 02:27 Reward-based Input Construction for Cross-document Relation Extraction 03:40 Passage-specific Prompt Tuning for Passage Reranking in Question Answering with Large Language Models 05:04 DORY: Deliberative Prompt Recovery for LLM 06:18 Unveiling the Lexical Sensitivity of LLMs: Combinatorial Optimization for Prompt Enhancement 07:35 It is Simple Sometimes: A Study On Improving Aspect-Based Sentiment Analysis Performance 08:59 FinGen: A Dataset for Argument Generation in Finance 09:42 Improving code-mixed hate detection by native sample mixing: A case study for Hindi-English code-mixed scenario 11:26 Multilingual Text Style Transfer: Datasets & Models for Indian Languages 13:01 An iterated learning model of language change that mixes supervised and unsupervised learning 14:01 Self-Augmented Preference Optimization: Off-Policy Paradigms for Language Model Alignment 15:29 That's Optional: A Contemporary Exploration of "that" Omission in English Subordinate Clauses 16:18 Don't Buy it! 
Reassessing the Ad Understanding Abilities of Contrastive Multimodal Models 17:20 Improving Reward Models with Synthetic Critiques 18:29 Towards Spoken Language Understanding via Multi-level Multi-grained Contrastive Learning 19:49 clembench-2024: A Challenging, Dynamic, Complementary, Multilingual Benchmark and Underlying Flexible Framework for LLMs as Multi-Action Agents 21:05 A comparison of correspondence analysis with PMI-based word embedding methods 22:05 Large Language Models: A New Approach for Privacy Policy Analysis at Scale 23:36 Preemptive Answer "Attacks" on Chain-of-Thought Reasoning 24:22 Learning to Estimate System Specifications in Linear Temporal Logic using Transformers and Mamba 25:48 OR-Bench: An Over-Refusal Benchmark for Large Language Models 27:20 Superlatives in Context: Explicit and Implicit Domain Restrictions for Superlative Frames 28:41 SaySelf: Teaching LLMs to Express Confidence with Self-Reflective Rationales 30:33 Towards a Fluid computer 31:33 You Only Scan Once: Efficient Multi-dimension Sequential Modeling with LightNet 33:01 LACIE: Listener-Aware Finetuning for Confidence Calibration in Large Language Models 35:02 Direct Alignment of Language Models via Quality-Aware Self-Refinement 36:19 Code Pretraining Improves Entity Tracking Abilities of Language Models

Jun 3, 2024

37m

253

Ep. 249 - May 30, 2024

ArXiv NLP research summaries for May 30, 2024.

May 31, 2024

1h 02m

252

Ep. 248 - May 29, 2024

ArXiv NLP research summaries for May 29, 2024.

May 30, 2024

43m

251

Ep. 247 - May 28, 2024

ArXiv NLP research summaries for May 28, 2024.

May 30, 2024

1h 01m

250

Ep. 246 - May 27, 2024

ArXiv NLP summaries for May 27, 2024

May 29, 2024

47m

249

Ep. 245 - May 26, 2024

ArXiv NLP research summaries for May 26, 2024.

May 29, 2024

31m

248

Ep. 244 - May 25, 2024

ArXiv NLP research summaries for May 25, 2024.

May 29, 2024

29m

247

Ep. 243 - May 24, 2024

ArXiv NLP research summaries for May 24, 2024.

May 29, 2024

37m

246

Ep. 242 - Part 2 - May 23, 2024

arXiv NLP research summaries for May 23, 2024. Today's Research Themes (AI-Generated): • Exploring efficient model scaling with Super Tiny Language Models, significantly reducing parameters while maintaining performance. • Advancing unsupervised adaptation in speech recognition with Self-TAught Recognizer, achieving robustness across diverse domains. • Introducing Large Language Models-guided adaptation for Temporal Knowledge Graph Reasoning, offering interpretable and dynamically updated reasoning. • Assessing the impact of inflectional endings in morphological analysis for the Uzbek language, enhancing word-level accuracy. • Investigating universal goal hijacking in LLMs with Semantic-guided Prompt Organization, confirming model vulnerability to targeted responses.

May 24, 2024

38m

245

Ep. 242 - Part 1 - May 23, 2024

arXiv NLP research summaries for May 23, 2024. Today's Research Themes (AI-Generated): • Exploring efficient model scaling with Super Tiny Language Models, significantly reducing parameters while maintaining performance. • Advancing unsupervised adaptation in speech recognition with Self-TAught Recognizer, achieving robustness across diverse domains. • Introducing Large Language Models-guided adaptation for Temporal Knowledge Graph Reasoning, offering interpretable and dynamically updated reasoning. • Assessing the impact of inflectional endings in morphological analysis for the Uzbek language, enhancing word-level accuracy. • Investigating universal goal hijacking in LLMs with Semantic-guided Prompt Organization, confirming model vulnerability to targeted responses.

May 24, 2024

42m

244

Ep. 241 - May 22, 2024

arXiv NLP research summaries for May 22, 2024. Today's Research Themes (AI-Generated): • Mosaic Instruction Tuning (Mosaic-IT) enhances LLMs by creating diverse instruction data, significantly reducing training costs. • Cross-subject classifiers and GPT2 word prediction improve P300 spellers, enhancing communication for ALS patients. • Dynamic vocabulary in ASR improves recognition performance for phrases, eliminating subword dependencies. • ByteT5 shows promise in multilingual translation of Biblical texts, potentially serving underrepresented language communities. • Zero-shot Adaptive Post Training Quantization method, AdpQ, improves LLM deployment efficiency without the need for calibration data.

May 24, 2024

1h 01m

243

Ep. 240 - May 21, 2024

arXiv NLP research summaries for May 21, 2024. Today's Research Themes (AI-Generated): • A new method is proposed for the scalable and precise identification of crucial 'circuits' within large language models using sparse autoencoders. • SirLLM enhances Large Language Models (LLMs) with the ability to maintain extended memory for infinite-length dialogues without fine-tuning. • Pyramid KV cache compression is introduced to significantly increase the throughput and decrease memory usage in LLM inference. • ProtT3, a Protein-to-Text Generation framework, is developed to aid Language Models in understanding and generating information from amino acid sequences. • Self-instruction based fine-tuning is shown to balance fact-checking accuracy and explainability in LLMs, while ensuring data security.

May 22, 2024

40m

242

Ep. 239 - May 20, 2024

arXiv NLP research summaries for May 20, 2024. Today's Research Themes (AI-Generated): • Advancements in ordinal classification techniques for NLP, focusing on explicit and implicit approaches within pretrained language models. • Multi-agent framework leveraging large language models for translating ultra-long literary texts, introducing innovative evaluation strategies. • Exploration of SEARNN as an alternative training approach for RNNs, demonstrating improved machine translation for low-resourced African languages. • Introduction of CoNLL#, a fine-grained error analysis and corrected test set for improved Named Entity Recognition evaluation. • Intuitive Fine-Tuning method aligns Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF) for language model optimization.

May 22, 2024

52m

241

Ep. 238 - May 19, 2024

arXiv NLP research summaries for May 19, 2024. Today's Research Themes (AI-Generated): • OpenRLHF introduces a scalable RLHF framework for training large language models efficiently with optimized algorithms. • MAML-en-LLM shows significant improvements in in-context learning of LLMs for better adaptation to unseen tasks. • Du-IN model achieves a breakthrough in speech decoding from intracranial neural signals using discrete units-guided mask modeling. • Efficient Prompt Tuning (EPT) enhances prompt tuning for LLMs by reducing training time and increasing performance consistency. • New research explores the predictive capabilities of LLMs within the educational sector and assesses their potential against human expertise.

May 22, 2024

22m

240

Ep. 237 - May 18, 2024

arXiv NLP research summaries for May 18, 2024. Today's Research Themes (AI-Generated): • Advancing mental health care through the automation of PTSD diagnostics using large language models. • Pioneering the development of a discourse-aware, knowledge-infused model to improve automated disease diagnosis. • Enhancing trust in annotation tasks by identifying LLM-generated content using novel topical embeddings. • Introducing perspectivist NLP datasets that account for diverse worldviews, improving language model generalization. • Developing multi-domain, multilingual lexicon generation to support language diversity in technical fields.

May 22, 2024

30m

239

Ep. 236 - May 17, 2024

arXiv Computer Vision research summaries for May 17, 2024. Today's Research Themes (AI-Generated): • VLMs safeguarded against patched visual prompt injectors through pixel-wise randomization and SmoothVLM framework • CM-UNet combines CNN and Mamba for efficient semantic segmentation of remote sensing images • LighTDiff employs a lightweight DDPM for low-light image enhancement in surgical endoscopy • NeRO MLP-based method offers improvements in autonomous driving through accurate road surface reconstruction • SymCode and SymNet introduced to resolve symmetry ambiguity in 6D pose estimation of symmetric objects

May 20, 2024

35m

238

Ep. 235 - May 16, 2024

arXiv NLP research summaries for May 16, 2024. Today's Research Themes (AI-Generated): • SecureLLM proposes a new secure LLM architecture for handling sensitive data through fine-tuning data silos and user-specific access. • Chameleon presents a mixed-modal early-fusion foundation model offering state-of-the-art image captioning and competitive long-form mixed-modal generation. • Enhancement of multimodal Chain of Thought reasoning through soft negative sampling to reduce hallucination in model outputs is demonstrated. • A study underlines the importance of pre-neural NLP approaches in educational curricula to build foundational understanding despite the dominance of neural methods. • Information Gain Optimized Tokenizer (IGOT) method introduced for domain-adaptive pretraining, offering computational efficiency and customization.

May 17, 2024

37m

237

Ep. 234 - May 15, 2024

arXiv NLP research summaries for May 15, 2024. Today's Research Themes (AI-Generated): • Novel AMR parser for clinical notes achieves high accuracy on cancer data, demonstrating potential for structured semantic analysis in healthcare. • HumanRankEval (HRE) proposed for evaluating conversational LMs, showcasing effectiveness in ranking model responses aligned with human judgment. • Research reveals BERT's superior performance over traditional methods in identifying online homophobic content, with an open-source dataset contribution. • Study shows word alignment optimization can mitigate hallucination and omission issues in large language model-based machine translation. • Large language models examined for psychological support potential, finding GPT-4 to provide more empathetic responses than Chat-GPT.

May 16, 2024

21m

236

Ep. 233 - May 14, 2024

arXiv NLP research summaries for May 14, 2024. Today's Research Themes (AI-Generated): • Exploring a novel model for joint extraction of entities and relations with enhanced information interaction in NLP. • Investigating adversarial robustness and countermeasures of multimodal speech-language models. • Introducing Seal-Tools, a self-instruct learning dataset for agent tuning and benchmarking in language models. • Addressing error correction in clinical text using ensembles of large language models and error categorization. • Proposing stylometric watermarks to distinguish between human and large language model-generated texts.

May 15, 2024

30m

235

Ep. 232 - May 13, 2024

arXiv NLP research summaries for May 13, 2024. Today's Research Themes (AI-Generated): • Improved Text-to-SQL generation via multiple prompts and choice selection outperforms prior in-context learning methods. • Evaluating large language models in medical applications presents unique challenges and opportunities for integration into clinical practice. • Curriculum learning strategies show potential for enhancing Large Language Model performance without scaling model size. • 'MacBehaviour' R package enables behavioral experimentation with various large language models for psychological studies. • Efficient Multi-sample Speculative Decoding 'EMS-SD' accelerates Large Language Models inference processes.

May 14, 2024

39m

234

Ep. 231 - May 12, 2024

arXiv Computer Vision research summaries for May 12, 2024. Today's Research Themes (AI-Generated): • Enhancing multi-modal machine learning with Meta-learned Cross-modal Knowledge Distillation for improved performance on tasks with missing modalities. • Developing resource-efficient semi-self-supervised domain adaptation techniques for precise agricultural tasks in varying conditions. • Proposing Energy Plan Denoising for stochastic trajectory prediction acknowledging pedestrian intrinsic uncertainties in autonomous driving. • Introducing memory-efficient image processing frameworks for high-resolution vision systems to facilitate detailed object insights. • Demonstrating advanced 3D hand mesh recovery from monocular RGB for improved human-computer interaction in complex environments.

May 14, 2024

12m

233

Ep. 230 - May 11, 2024

arXiv NLP research summaries for May 11, 2024. Today's Research Themes (AI-Generated): • Combinatoriality in human language reflected through library learning in Chinese writing system evolution. • Code Representation and Execution (CoRE) enabling natural language, pseudo-code, and flow programming via large language models. • Introduction of EmoMix-3L, a novel code-mixed dataset for multilingual emotion detection in Bangla, English, and Hindi. • Piccolo2 sets new benchmark in text embedding with multi-task hybrid loss training for diverse NLP tasks. • AraSpell framework advances Arabic spelling correction using deep learning and artificial data generation.

May 14, 2024

19m

232

Ep. 229 - May 10, 2024

arXiv NLP research summaries for May 10, 2024. Today's Research Themes (AI-Generated): • SaudiBERT demonstrates superior performance in processing Arabic text in Saudi dialect, outperforming multi-dialectal models. • Automated generation of model and data cards by Large Language Models promotes responsible AI through enhanced documentation. • D-Pruner introduces domain-specific, task-agnostic compression to improve the efficiency of Large Language Models. • Aspect-based summarization of health answers from CQA forums captures diverse opinions and solutions, improving platform usability. • Exploratory study of gender bias in Hindi language technology reveals the importance of context-specific approaches.

May 13, 2024

32m

231

Ep. 228 - May 9, 2024

arXiv NLP research summaries for May 09, 2024. Today's Research Themes (AI-Generated): • Cline dataset adds human acceptability judgments to English-Hindi code-mixed text, enhancing natural language processing models. • OpenFactCheck introduces a unified factuality evaluation for large language models, aiming to ensure output accuracy. • G-SAP integrates knowledge graphs with language models to improve commonsense reasoning and cross-modal knowledge transfer. • Assessing dialect robustness of language models reveals significant performance disparity across English dialects. • Novel Chain of Attack method exposes vulnerabilities of LLMs in multi-turn dialogues by adjusting attack strategies contextually.

May 10, 2024

25m

230

Ep. 227 - May 8, 2024

arXiv NLP research summaries for May 08, 2024. Today's Research Themes (AI-Generated): • ACORN dataset provides insights into LLMs' explanation evaluation consistency compared to human raters. • DALK framework enhances LLM capabilities for specialized domain knowledge integration, demonstrated on Alzheimer's Disease. • APrompt4EM introduces augmented prompt tuning, showing promise in low-resource generalized entity matching. • ChuXin is a fully open-source 1.6B parameter language model aimed at increasing transparency and fostering innovation. • Two innovative approaches to NER: human-annotated corpora for Indian languages and prompt-based logical reasoning enhancement.

May 9, 2024

29m

229

Ep. 226 - May 7, 2024

arXiv NLP research summaries for May 07, 2024. Today's Research Themes (AI-Generated): • Hybrid AI strategy reduces hallucinations in text summarization using GPT refinement process. • Philosophy of cognitive science gains new insights from the progress in deep learning. • GPT models demonstrate robust evaluation potential for transformer-based text summaries. • FlashBack enhances inference efficiency in Retrieval-Augmented Language Modeling (RALM). • PuzzleBen, a weakly supervised benchmark, advances LLMs’ reasoning abilities with minimal human supervision.

May 8, 2024

38m

228

Ep. 225 - May 6, 2024

arXiv Computer Vision research summaries for May 06, 2024. Today's Research Themes (AI-Generated): • Diffusion models showcase potential for high-quality video generation and modification. • Deep learning frameworks advance multi-parametric estimation in MR imaging. • Enhanced multimodal AI models set new standards in medical data analysis. • Improved cross-modal feature fusion drives RGB-T tracking performance. • Dual-encoder models trained with language corpora improve paraphrased query retrieval.

May 7, 2024

22m

227

Ep. 224 - May 5, 2024

arXiv NLP research summaries for May 05, 2024. Today's Research Themes (AI-Generated): • Exploring negative emotional stimuli can enhance LLMs' performance on complex tasks. • A novel concept-based RAG framework utilizes AMR to enhance information retrieval in LLMs. • The FairMonitor framework adopts a static-dynamic method for detecting biases in LLMs. • Cultural reasoning in LLMs is improved through tuning with culturally-related instruction datasets. • A fully differentiable MoE architecture for language model pre-training exhibits superior performance.

May 7, 2024

26m

226

Ep. 223 - May 4, 2024

arXiv NLP research summaries for May 04, 2024. Today's Research Themes (AI-Generated): • A proposed QUEST framework aims to standardize human evaluation of LLMs in healthcare, ensuring safety and reliability. • A new Transformer and BERT-based model advances Vietnamese spelling correction, outperforming Google Docs. • Astro-NER explores LLMs as domain expert annotators, contributing to named entity recognition in astronomy literature. • The Mixat dataset enhances Arabic-English ASR by addressing code-switching in Emirati speech. • Research demonstrates LLMs' potential in historical analysis, unveiling narrative patterns in Holocaust testimonies.

May 7, 2024

20m

225

Ep. 222 - May 3, 2024

arXiv NLP research summaries for May 03, 2024. Today's Research Themes (AI-Generated): • SGHateCheck introduces a hate speech detection framework tailored to Singapore's linguistic diversity, enhancing content moderation. • SUKHSANDESH, an AI-based sexual education platform for rural India, leverages avatar therapy to increase empathy and understanding. • Research demonstrates how external knowledge and goal-planning improve conversational recommender systems powered by Large Language Models. • An advanced Bi-LSTM model enhances next-word prediction and sentence completion for the Bangla language, reaching high accuracy levels. • The study on DALLMi explores semi-supervised domain adaption in text multi-label classification with Large Language Models.

May 6, 2024

40m

224

Ep. 221 - May 2, 2024

arXiv NLP research summaries for May 02, 2024. Today's Research Themes (AI-Generated): • IgboAPI dataset advances Igbo language technologies by enriching machine translation and semantic lexicon with multi-dialectal data. • UniGen addresses domain generalization in sentiment analysis through universal zero-shot dataset generation, enhancing small model applicability. • Efficient data generation for dialogue systems is demonstrated by combining large language model prompting with human expertise in MISeD dataset creation. • Challenges in modelling human dialogue acts for grounding communication are analyzed, highlighting the limits of supervised learning-based NLP dialogue models. • The TartuNLP team's first-place win in EvaLatin 2024 showcases the efficacy of Large Language Model-aided annotation for emotion polarity detection in historical Latin texts.

May 3, 2024

32m

223

Ep. 220 - May 1, 2024

arXiv NLP research summaries for May 01, 2024. Today's Research Themes (AI-Generated): • Advances in large language models (LLMs) emphasize robustness and alignment with human values. • Novel fine-tuning techniques enhance model performance without a significant increase in computational costs. • Multimodal models, incorporating text and imagery, show progress in complex tasks like sarcasm detection and sentiment analysis. • Investigations reveal the persistence of biases in language models, even absent gender-related language in inputs. • Model quantization techniques aim to balance model confidence and performance with efficiency and lower resource consumption.

May 2, 2024

30m

222

Ep. 219 - April 30, 2024

arXiv NLP research summaries for April 30, 2024. Today's Research Themes (AI-Generated): • HydraLoRA outperforms other Parameter-Efficient Fine-Tuning methods by leveraging asymmetric structures without the need for domain knowledge. • The ViTHSD dataset contributes to improved targeted hate speech detection on Vietnamese social media using a hybrid deep learning model. • The new benchmark 'Suvach' elevates Hindi question answering by generating datasets specifically crafted for Hindi language models. • A Graph Attention Network enhanced by dependency tree structures achieves state-of-the-art performance in aspect and opinion term extraction. • Octopus v4 integrates multiple open-source language models to handle specialized tasks, challenging the dominance of proprietary models like GPT-4.

May 1, 2024

42m

