TechcraftingAI NLP

PODCAST · technology

TechcraftingAI NLP brings you daily summaries of the latest arXiv Computation and Language research.

  1. 271

    Ep. 263 - Part 2 - June 13, 2024

    ArXiv NLP research for Thursday, June 13, 2024. 00:20: Chain-of-Though (CoT) prompting strategies for medical error detection and correction 01:31: CoastTerm: a Corpus for Multidisciplinary Term Extraction in Coastal Scientific Literature 02:52: RH-SQL: Refined Schema and Hardness Prompt for Text-to-SQL 04:01: Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs 05:24: Leveraging Explicit Reasoning for Inference Integration in Commonsense-Augmented Dialogue Models 06:38: Investigating the translation capabilities of Large Language Models trained on parallel data only 07:56: LASER: Learning by Aligning Self-supervised Representations of Speech for Improving Content-related Tasks 09:09: DefAn: Definitive Answer Dataset for LLMs Hallucination Evaluation 11:20: Test of Time: A Benchmark for Evaluating LLMs on Temporal Reasoning 12:46: Orthogonality and isotropy of speaker and phonetic information in self-supervised speech representations 13:53: Language Complexity and Speech Recognition Accuracy: Orthographic Complexity Hurts, Phonological Complexity Doesn't 14:47: ReadCtrl: Personalizing text generation with readability-controlled instruction learning 16:32: Self-Training for Sample-Efficient Active Learning for Text Classification with Pre-Trained Language Models 17:49: Sharing Matters: Analysing Neurons Across Languages and Tasks in LLMs 19:18: End-to-end Streaming model for Low-Latency Speech Anonymization 20:22: Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback 22:25: On the Effects of Heterogeneous Data Sources on Speech-to-Text Foundation Models 23:33: Understanding Jailbreak Success: A Study of Latent Space Dynamics in Large Language Models 24:35: Exploring Spoken Language Identification Strategies for Automatic Transcription of Multilingual Broadcast and Institutional Speech 25:47: AlignMMBench: Evaluating Chinese Multimodal Alignment in Large Vision-Language Models 27:15: Transformers meet 
Neural Algorithmic Reasoners 28:32: REVS: Unlearning Sensitive Information in Language Models via Rank Editing in the Vocabulary Space 30:02: Learning from Natural Language Explanations for Generalizable Entity Matching 31:14: ProxyLM: Predicting Language Model Performance on Multilingual Tasks via Proxy Models 32:29: DiscreteSLU: A Large Language Model with Self-Supervised Discrete Speech Units for Spoken Language Understanding 33:43: Improving Autoregressive Training with Dynamic Oracles

  2. 270

    Ep. 263 - Part 1 - June 13, 2024

    ArXiv NLP research for Thursday, June 13, 2024. 00:20: Deep Exploration of Cross-Lingual Zero-Shot Generalization in Instruction Tuning 01:53: Mixture-of-Skills: Learning to Optimize Data Usage for Fine-Tuning Large Language Models 03:26: Automated Essay Scoring Using Grammatical Variety and Errors with Multi-Task Learning and Item Response Theory 04:33: Linguistic Bias in ChatGPT: Language Models Reinforce Dialect Discrimination 06:05: DisfluencySpeech -- Single-Speaker Conversational Speech Dataset with Paralanguage 07:26: Research on Optimization of Natural Language Processing Model Based on Multimodal Deep Learning 08:41: ContraSolver: Self-Alignment of Language Models by Resolving Internal Preference Contradictions 10:07: An Approach to Build Zero-Shot Slot-Filling System for Industry-Grade Conversational Assistants 11:42: Plan, Generate and Complicate: Improving Low-resource Dialogue State Tracking via Easy-to-Difficult Zero-shot Data Augmentation 12:42: No perspective, no perception!! 
Perspective-aware Healthcare Answer Summarization 14:28: Delta-CoMe: Training-Free Delta-Compression with Mixed-Precision for Large Language Models 16:02: An Initial Investigation of Language Adaptation for TTS Systems under Low-resource Scenarios 17:21: Navigating the Shadows: Unveiling Effective Disturbances for Modern AI Content Detectors 18:48: Exploring Multilingual Unseen Speaker Emotion Recognition: Leveraging Co-Attention Cues in Multitask Learning 19:52: Word Order in English-Japanese Simultaneous Interpretation: Analyses and Evaluation using Chunk-wise Monotonic Translation 21:12: Multi-Agent Software Development through Cross-Team Collaboration 22:55: LLM Reading Tea Leaves: Automatically Evaluating Topic Models with Large Language Models 24:14: Bayesian Statistical Modeling with Predictors from LLMs 25:39: ME-Switch: A Memory-Efficient Expert Switching Framework for Large Language Models 27:28: Language Models are Crossword Solvers 28:32: MiLoRA: Harnessing Minor Singular Components for Parameter-Efficient LLM Finetuning 29:51: CUDRT: Benchmarking the Detection of Human vs. Large Language Models Generated Texts 31:29: Living in the Moment: Can Large Language Models Grasp Co-Temporal Reasoning? 32:59: 3M: Multi-modal Multi-task Multi-teacher Learning for Game Event Detection 34:08: Modeling Comparative Logical Relation with Contrastive Learning for Text Generation 35:42: SciKnowEval: Evaluating Multi-level Scientific Knowledge of Large Language Models

  3. 269

    Ep. 262 - June 12, 2024

    ArXiv NLP research for Wednesday, June 12, 2024. 00:19: VALL-E R: Robust and Efficient Zero-Shot Text-to-Speech Synthesis via Monotonic Alignment 02:05: BookSQL: A Large Scale Text-to-SQL Dataset for Accounting Domain 03:15: Designing a Dashboard for Transparency and Control of Conversational AI 04:46: Label-aware Hard Negative Sampling Strategies with Momentum Contrastive Learning for Implicit Hate Speech Detection 05:51: Exploring Speech Foundation Models for Speaker Diarization in Child-Adult Dyadic Interactions 06:53: Exploring Self-Supervised Multi-view Contrastive Learning for Speech Emotion Recognition with Limited Annotations 07:52: Guiding Frame-Level CTC Alignments Using Self-knowledge Distillation 08:55: DeTriever: Decoder-representation-based Retriever for Improving NL2SQL In-Context Learning 10:20: Automated Information Extraction from Thyroid Operation Narrative: A Comparative Study of GPT-4 and Fine-tuned KoELECTRA 11:35: Large Language Model Unlearning via Embedding-Corrupted Prompts 13:17: Defining and Detecting Vulnerability in Human Evaluation Guidelines: A Preliminary Study Towards Reliable NLG Evaluation 14:46: Better than Random: Reliable NLG Human Evaluation with Constrained Active Sampling 16:02: LibriTTS-P: A Corpus with Speaking Style and Speaker Identity Prompts for Text-to-Speech and Style Captioning 17:18: Guiding In-Context Learning of LLMs through Quality Estimation for Machine Translation 18:37: It Takes Two: On the Seamlessness between Reward and Policy Model in RLHF 20:02: Adversarial Evasion Attack Efficiency against Large Language Models 21:06: Learning Job Title Representation from Job Description Aggregation Network 21:59: Large Language Models Meet Text-Centric Multimodal Sentiment Analysis: A Survey 23:35: AustroTox: A Dataset for Target-Based Austrian German Offensive Language Detection 24:38: Languages Transferred Within the Encoder: On Representation Transfer in Zero-Shot Multilingual Translation 25:56: Multimodal 
Table Understanding 27:20: CoXQL: A Dataset for Parsing Explanation Requests in Conversational XAI Systems 28:51: Supportiveness-based Knowledge Rewriting for Retrieval-augmented Language Modeling 30:36: Legend: Leveraging Representation Engineering to Annotate Safety Margin for Preference Datasets 31:57: Semi-Supervised Spoken Language Glossification 33:16: Underneath the Numbers: Quantitative and Qualitative Gender Fairness in LLMs for Depression Prediction 34:37: A Dialogue Game for Eliciting Balanced Collaboration 35:23: Transformer-based Model for ASR N-Best Rescoring and Rewriting 36:16: SumHiS: Extractive Summarization Exploiting Hidden Structure 36:53: Figuratively Speaking: Authorship Attribution via Multi-Task Figurative Language Modeling 38:08: Leveraging Large Language Models for Web Scraping 39:51: M3T: A New Benchmark Dataset for Multi-Modal Document-Level Machine Translation 41:15: Is Programming by Example solved by LLMs? 42:29: Speech Emotion Recognition with ASR Transcripts: A Comprehensive Study on Word Error Rate and Fusion Techniques 43:42: Towards Unsupervised Speech Recognition Without Pronunciation Models 44:50: cPAPERS: A Dataset of Situated and Multimodal Interactive Conversations in Scientific Papers 45:57: Understanding Sounds, Missing the Questions: The Challenge of Object Hallucination in Large Audio-Language Models 47:02: Tailoring Generative AI Chatbots for Multiethnic Communities in Disaster Preparedness Communication: Extending the CASA Paradigm 48:12: Next-Generation Database Interfaces: A Survey of LLM-based Text-to-SQL 49:56: TasTe: Teaching Large Language Models to Translate through Self-Reflection 51:28: OLMES: A Standard for Language Model Evaluations 52:47: Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing

  4. 268

    Ep. 261 - Part 2 - June 11, 2024

    ArXiv NLP research for Tuesday, June 11, 2024. 00:20: Scientific Computing with Large Language Models 01:08: Speaking Your Language: Spatial Relationships in Interpretable Emergent Communication 02:19: Bilingual Sexism Classification: Fine-Tuned XLM-RoBERTa and GPT-3.5 Few-Shot Learning 03:51: Fine-tuning with HED-IT: The impact of human post-editing for dialogical language models 05:26: Can We Achieve High-quality Direct Speech-to-Speech Translation without Parallel Speech Data? 07:03: Joint Learning of Context and Feedback Embeddings in Spoken Dialogue 07:57: BertaQA: How Much Do Language Models Know About Local Culture? 09:17: MM-KWS: Multi-modal Prompts for Multilingual User-defined Keyword Spotting 10:20: CTC-based Non-autoregressive Textless Speech-to-Speech Translation 11:21: Toxic Memes: A Survey of Computational Perspectives on the Detection and Explanation of Meme Toxicities 13:27: GLIMPSE: Pragmatically Informative Multi-Document Summarization for Scholarly Reviews 14:40: BvSP: Broad-view Soft Prompting for Few-Shot Aspect Sentiment Quad Prediction 16:32: When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models 18:01: Limited Out-of-Context Knowledge Reasoning in Large Language Models 19:36: MINERS: Multilingual Language Models as Semantic Retrievers 20:42: Learning Domain-Invariant Features for Out-of-Context News Detection 22:03: Textual Similarity as a Key Metric in Machine Translation Quality Estimation 23:02: On the Robustness of Document-Level Relation Extraction Models to Entity Name Variations 24:31: Multimodal Belief Prediction 25:29: Advancing Annotation of Stance in Social Media Posts: A Comparative Analysis of Large Language Models and Crowd Sourcing 26:56: Paraphrasing in Affirmative Terms Improves Negation Understanding 27:37: CADS: A Systematic Literature Review on the Challenges of Abstractive Dialogue Summarization 29:38: TextGrad: Automatic "Differentiation" via Text 
31:35: Just Because We Camp, Doesn't Mean We Should: The Ethics of Modelling Queer Voices 32:35: THaLLE: Text Hyperlocally Augmented Large Language Extension -- Technical Report 33:51: Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling 35:22: Simple and Effective Masked Diffusion Language Models 36:35: Open-LLM-Leaderboard: From Multi-choice to Open-style Questions for LLMs Evaluation, Benchmark, and Arena

  5. 267

    Ep. 261 - Part 1 - June 11, 2024

    ArXiv NLP research for Tuesday, June 11, 2024. 00:20: A Non-autoregressive Generation Framework for End-to-End Simultaneous Speech-to-Any Translation 01:41: Post-Hoc Answer Attribution for Grounded and Trustworthy Long Document Comprehension: Task, Insights, and Challenges 02:32: A Probabilistic Framework for LLM Hallucination Detection via Belief Tree Propagation 04:08: Evolving Subnetwork Training for Large Language Models 05:31: Missingness-resilient Video-enhanced Multimodal Disfluency Detection 06:37: Mitigating Boundary Ambiguity and Inherent Bias for Text Classification in the Era of Large Language Models 08:14: Crayon: Customized On-Device LLM via Instant Adapter Blending and Edge-Server Hybrid Inference 09:33: Delving into ChatGPT usage in academic writing through excess vocabulary 10:53: Paying More Attention to Source Context: Mitigating Unfaithful Translations from Large Language Model 12:12: CoEvol: Constructing Better Responses for Instruction Finetuning through Multi-Agent Cooperation 13:26: Effectively Compress KV Heads for LLM 15:00: Benchmarking Trustworthiness of Multimodal Large Language Models: A Comprehensive Study 16:54: Reading Miscue Detection in Primary School through Automatic Speech Recognition 18:09: HalluDial: A Large-Scale Benchmark for Automatic Dialogue-Level Hallucination Evaluation 20:01: DARA: Decomposition-Alignment-Reasoning Autonomous Language Agent for Question Answering over Knowledge Graphs 21:15: Efficiently Exploring Large Language Models for Document-Level Machine Translation with In-context Learning 22:35: Advancing Tool-Augmented Large Language Models: Integrating Insights from Errors in Inference Trees 24:42: Translating speech with just images 25:35: Never Miss A Beat: An Efficient Recipe for Context Window Extension of Large Language Models with Consistent "Middle" Enhancement 26:51: Teaching Language Models to Self-Improve by Learning from Language Feedback 28:25: Merging Improves Self-Critique Against 
Jailbreak Attacks 29:18: Towards Human-AI Collaboration in Healthcare: Guided Deferral Systems with Large Language Models 30:11: Improving Autoformalization using Type Checking 31:37: Improving Commonsense Bias Classification by Mitigating the Influence of Demographic Terms 33:19: Decipherment-Aware Multilingual Learning in Jointly Trained Language Models 34:20: DUAL-REFLECT: Enhancing Large Language Models for Reflective Translation through Dual Learning Feedback Mechanisms 35:20: On the Hallucination in Simultaneous Machine Translation 36:07: MBBQ: A Dataset for Cross-Lingual Comparison of Stereotypes in Generative LLMs 37:42: Scholarly Question Answering using Large Language Models in the NFDI4DataScience Gateway

  6. 266

    Ep. 260 - June 10, 2024

    ArXiv NLP research for Monday, June 10, 2024. 00:19: Shoulders of Giants: A Look at the Degree and Utility of Openness in NLP Research 00:59: HOLMES: Hyper-Relational Knowledge Graphs for Multi-hop Question Answering using LLMs 02:29: The Curse of Popularity: Popular Entities have Catastrophic Side Effects when Deleting Knowledge from Language Models 03:24: MATES: Model-Aware Data Selection for Efficient Pretraining with Data Influence Models 04:51: A Multidimensional Framework for Evaluating Lexical Semantic Change with Social Science Applications 05:49: Synth-SBDH: A Synthetic Dataset of Social and Behavioral Determinants of Health for Clinical Text 07:10: Efficient k-Nearest-Neighbor Machine Translation with Dynamic Retrieval 09:08: Recurrent Context Compression: Efficiently Expanding the Context Window of LLM 10:35: Enhancing Long-Term Memory using Hierarchical Aggregate Tree for Retrieval Augmented Generation 11:26: Verifiable Generation with Subsentence-Level Fine-Grained Citations 12:36: Comparing Data Augmentation Methods for End-to-End Task-Oriented Dialog Systems 13:55: Building Bridges: A Dataset for Evaluating Gender-Fair Machine Translation into German 15:28: Can I understand what I create? 
Self-Knowledge Evaluation of Large Language Models 16:28: Language Models Resist Alignment 17:58: LINGOLY: A Benchmark of Olympiad-Level Linguistic Reasoning Puzzles in Low-Resource and Extinct Languages 19:27: Learning Fine-Grained Controllability on Speech Generation via Efficient Fine-Tuning 20:27: Combining Embeddings and Domain Knowledge for Job Posting Duplicate Detection 21:37: MaskLID: Code-Switching Language Identification through Iterative Masking 22:49: Multi-Prompting Decoder Helps Better Language Understanding 24:22: Tx-LLM: A Large Language Model for Therapeutics 26:21: Self-Tuning: Instructing LLMs to Effectively Acquire New Knowledge through Self-Teaching 27:43: A Parameter-efficient Language Extension Framework for Multilingual ASR 29:06: MedExQA: Medical Question Answering Benchmark with Multiple Explanations 30:36: Sustained Vowels for Pre- vs Post-Treatment COPD Classification 31:49: MASSW: A New Dataset and Benchmark Tasks for AI-Assisted Scientific Workflows 33:40: Symmetric Dot-Product Attention for Efficient Training of BERT Language Models 35:00: Annotation alignment: Comparing LLM and human annotations of conversational safety 36:07: mHuBERT-147: A Compact Multilingual HuBERT Model 37:27: Should We Fine-Tune or RAG? 
Evaluating Different Techniques to Adapt LLMs for Dialogue 39:00: INTERSPEECH 2009 Emotion Challenge Revisited: Benchmarking 15 Years of Progress in Speech Emotion Recognition 40:06: Meta Learning Text-to-Speech Synthesis in over 7000 Languages 40:59: Controlling Emotion in Text-to-Speech with Natural Language Prompts 41:55: Language Models are Alignable Decision-Makers: Dataset and Application to the Medical Triage Domain 43:29: Multimodal Contextualized Semantic Parsing from Speech 44:25: Interpretability of Language Models via Task Spaces 45:45: Evaluating the Retrieval Component in LLM-Based Question Answering Systems 46:52: Reasoning in Token Economies: Budget-Aware Evaluation of LLM Reasoning Strategies 48:08: Can Language Models Serve as Text-Based World Simulators?

  7. 265

    Ep. 259 - June 9, 2024

    ArXiv NLP research for Sunday, June 09, 2024. 00:19: How Alignment and Jailbreak Work: Explain LLM Safety through Intermediate Hidden States 01:40: DomainRAG: A Chinese Benchmark for Evaluating Domain-specific Retrieval-Augmented Generation 03:25: Do LLMs Exhibit Human-Like Reasoning? Evaluating Theory of Mind in LLMs for Open-Ended Responses 05:08: MS-HuBERT: Mitigating Pre-training and Inference Mismatch in Masked Language Modelling methods for learning Speech Representations 06:17: SinkLoRA: Enhanced Efficiency and Chat Capabilities for Long-Context Large Language Models 08:11: Peer Review as A Multi-Turn and Long-Context Dialogue with Role-Based Interactions 09:54: MoPS: Modular Story Premise Synthesis for Open-Ended Automatic Story Generation 11:20: QGEval: A Benchmark for Question Generation Evaluation 12:44: MrRank: Improving Question Answering Retrieval System through Multi-Result Ranking Model 13:43: Arabic Diacritics in the Wild: Exploiting Opportunities for Improved Diacritization 14:46: The BiGGen Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models with Language Models 16:30: RE-RAG: Improving Open-Domain QA Performance and Interpretability with Relevance Estimator in Retrieval-Augmented Generation 18:14: Hidden Holes: topological aspects of language models 19:46: Do Prompts Really Prompt? Exploring the Prompt Understanding Capability of Whisper 20:40: Seventeenth-Century Spanish American Notary Records for Fine-Tuning Spanish Large Language Models 22:02: MedREQAL: Examining Medical Knowledge Recall of Large Language Models via Question Answering 23:12: II-Bench: An Image Implication Understanding Benchmark for Multimodal Large Language Models 25:17: Zero-Shot End-To-End Spoken Question Answering In Medical Domain 26:27: Are Large Language Models Actually Good at Text Style Transfer? 
27:32: Feriji: A French-Zarma Parallel Corpus, Glossary & Translator 28:56: TTM-RE: Memory-Augmented Document-Level Relation Extraction 30:12: Why Don't Prompt-Based Fairness Metrics Correlate? 31:27: Hello Again! LLM-powered Personalized Agent for Long-term Dialogue 33:12: Semisupervised Neural Proto-Language Reconstruction 34:12: Prompting Large Language Models with Audio for General-Purpose Speech Summarization 35:14: A Dual-View Approach to Classifying Radiology Reports by Co-Training 36:07: ThaiCoref: Thai Coreference Resolution Dataset

  8. 264

    Ep. 258 - June 8, 2024

    ArXiv NLP research for Saturday, June 08, 2024. 00:19: MemeGuard: An LLM and VLM-based Framework for Advancing Content Moderation via Meme Intervention 01:44: Toward Reliable Ad-hoc Scientific Information Extraction: A Case Study on Two Materials Datasets 02:30: Flexible and Adaptable Summarization via Expertise Separation 04:18: Write Summary Step-by-Step: A Pilot Study of Stepwise Summarization 06:07: CaLM: Contrasting Large and Small Language Models to Verify Grounded Generation 07:23: Venn Diagram Prompting : Accelerating Comprehension with Scaffolding Effect 08:45: VALL-E 2: Neural Codec Language Models are Human Parity Zero-Shot Text to Speech Synthesizers 10:19: Planning Like Human: A Dual-process Framework for Dialogue Planning 11:48: Deconstructing The Ethics of Large Language Models from Long-standing Issues to New-emerging Dilemmas 12:57: Recent advancements in computational morphology : A comprehensive survey 14:01: MaTableGPT: GPT-based Table Data Extractor from Materials Science Literature 15:41: Design of reliable technology valuation model with calibrated machine learning of patent indicators 17:08: Fighting Against the Repetitive Training and Sample Dependency Problem in Few-shot Named Entity Recognition 18:59: Investigating and Addressing Hallucinations of LLMs in Tasks Involving Negation 20:25: Generalist Multimodal AI: A Review of Architectures, Challenges and Opportunities 21:47: ThatiAR: Subjectivity Detection in Arabic News Sentences 23:07: Do LLMs Recognize me, When I is not me: Assessment of LLMs Understanding of Turkish Indexical Pronouns in Indexical Shift Contexts 24:49: Creativity Has Left the Chat: The Price of Debiasing Language Models 25:57: CERET: Cost-Effective Extrinsic Refinement for Text Generation 27:05: GrowOVER: How Can LLMs Adapt to Growing Real-World Knowledge? 
28:07: Video-Language Understanding: A Survey from Model Architecture, Model Training, and Data Perspectives 29:03: ATLAS: Improving Lay Summarisation with Attribute-based Control

  9. 263

    Ep. 257 - June 7, 2024

    ArXiv NLP research for Friday, June 07, 2024. 00:19: Key-Element-Informed sLLM Tuning for Document Summarization 01:22: Low-Resource Cross-Lingual Summarization through Few-Shot Learning with Large Language Models 02:42: Large Language Model-guided Document Selection 04:13: More Victories, Less Cooperation: Assessing Cicero's Diplomacy Play 05:24: DiNeR: a Large Realistic Dataset for Evaluating Compositional Generalization 06:43: MATTER: Memory-Augmented Transformer Using Heterogeneous Knowledge Sources 08:01: Mixture-of-Agents Enhances Large Language Model Capabilities 09:09: AICoderEval: Improving AI Domain Code Generation of Large Language Models 11:00: CRAG -- Comprehensive RAG Benchmark 13:04: CRiskEval: A Chinese Multi-Level Risk Evaluation Benchmark Dataset for Large Language Models 14:52: Think out Loud: Emotion Deducing Explanation in Dialogues 16:43: WildBench: Benchmarking LLMs with Challenging Tasks from Real Users in the Wild 18:46: SelfGoal: Your Language Agents Already Know How to Achieve High-level Goals 19:58: BERTs are Generative In-Context Learners 20:43: Annotating FrameNet via Structure-Conditioned Language Generation 21:49: Revisiting Catastrophic Forgetting in Large Language Model Tuning 22:43: FedLLM-Bench: Realistic Benchmarks for Federated Learning of Large Language Models 24:33: Do Language Models Exhibit Human-like Structural Priming Effects? 
25:27: Uncertainty Aware Learning for Language Model Alignment 26:50: The Russian Legislative Corpus 27:24: ComplexTempQA: A Large-Scale Dataset for Complex Temporal Question Answering 28:53: HateDebias: On the Diversity and Variability of Hate Speech Debiasing 30:29: A Deep Dive into the Trade-Offs of Parameter-Efficient Preference Alignment Techniques 32:00: Sexism Detection on a Data Diet 33:18: XTTS: a Massively Multilingual Zero-Shot Text-to-Speech Model 34:21: Through the Thicket: A Study of Number-Oriented LLMs derived from Random Forest Models 35:32: LLM-based speaker diarization correction: A generalizable approach 36:52: TCMD: A Traditional Chinese Medicine QA Dataset for Evaluating Large Language Models 38:10: BAMO at SemEval-2024 Task 9: BRAINTEASER: A Novel Task Defying Common Sense 39:10: Quantifying Geospatial in the Common Crawl Corpus 40:14: MEFT: Memory-Efficient Fine-Tuning through Sparse Adapter 41:47: Language models emulate certain cognitive profiles: An investigation of how predictability measures interact with individual differences 43:19: Compositional Generalization with Grounded Language Models 44:26: Scenarios and Approaches for Situated Natural Language Explanations 46:04: Are Large Language Models More Empathetic than Humans? 47:38: SUMIE: A Synthetic Benchmark for Incremental Entity Summarization 48:52: Multi-Head RAG: Solving Multi-Aspect Problems with LLMs 50:33: An Empirical Study on Parameter-Efficient Fine-Tuning for MultiModal Large Language Models

  10. 262

    Ep. 256 - Part 2 - June 6, 2024

    ArXiv NLP research for Thursday, June 06, 2024. 00:20: The syntax-semantics interface in a child's path: A study of 3- to 11-year-olds' elicited production of Mandarin recursive relative clauses 02:17: Ask LLMs Directly, "What shapes your bias?": Measuring Social Bias in Large Language Models 03:39: Explainability and Hate Speech: Structured Explanations Make Social Media Moderators Faster 04:36: Intention and Face in Dialog 05:48: Uncovering Limitations of Large Language Models in Information Seeking from Tables 07:15: Are We Done with MMLU? 08:41: Legal Judgment Reimagined: PredEx and the Rise of Intelligent AI Interpretation in Indian Courts 09:53: Do Language Models Understand Morality? Towards a Robust Detection of Moral Content 11:47: Every Answer Matters: Evaluating Commonsense with Probabilistic Measures 12:49: Towards Understanding Task-agnostic Debiasing Through the Lenses of Intrinsic Bias and Forgetfulness 14:26: Pointer-Guided Pre-Training: Infusing Large Language Models with Paragraph-Level Contextual Awareness 15:35: Confabulation: The Surprising Value of Large Language Model Hallucinations 16:42: DICE: Detecting In-distribution Contamination in LLM's Fine-tuning Phase for Math Reasoning 18:25: Legal Documents Drafting with Fine-Tuned Pre-Trained Large Language Model 19:32: ValueBench: Towards Comprehensively Evaluating Value Orientations and Understanding of Large Language Models 20:50: mCSQA: Multilingual Commonsense Reasoning Dataset with Unified Creation Strategy by Language Models and Humans 22:21: What Do Language Models Learn in Context? The Structured Task Hypothesis 23:38: Rethinking LLM and Linguistic Steganalysis: An Efficient Detection of Strongly Concealed Stego 24:58: BEADs: Bias Evaluation Across Domains 26:41: FairytaleQA Translated: Enabling Educational Question and Answer Generation in Less-Resourced Languages 28:03: Benchmark Data Contamination of Large Language Models: A Survey 29:02: Transformers need glasses! 
Information over-squashing in language tasks 30:26: Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models 31:58: Characterizing Similarities and Divergences in Conversational Tones in Humans and LLMs by Sampling with People 33:44: ABEX: Data Augmentation for Low-Resource NLU via Expanding Abstract Descriptions 35:19: What Languages are Easy to Language-Model? A Perspective from Learning Probabilistic Regular Languages 36:41: PaCE: Parsimonious Concept Engineering for Large Language Models

  11. 261

    Ep. 256 - Part 1 - June 6, 2024

    ArXiv NLP research for Thursday, June 06, 2024. 00:20: Efficient Knowledge Infusion via KG-LLM Alignment 01:25: NAP^2: A Benchmark for Naturalness and Privacy-Preserving Text Rewriting by Learning from Human 02:34: Character-Level Chinese Dependency Parsing via Modeling Latent Intra-Word Structure 03:30: XL-HeadTags: Leveraging Multimodal Retrieval Augmentation for the Multilingual Generation of News Headlines and Tags 04:59: End-to-End Trainable Soft Retriever for Low-resource Relation Extraction 06:07: Light-PEFT: Lightening Parameter-Efficient Fine-Tuning via Early Pruning 07:37: Improving Zero-Shot Chinese-English Code-Switching ASR with kNN-CTC and Gated Monolingual Datastores 08:52: ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search 10:29: Chaos with Keywords: Exposing Large Language Models Sycophancy to Misleading Keywords and Evaluating Defense Strategies 11:39: Lean Workbook: A large-scale Lean problem set formalized from natural language math problems 12:56: Speculative Decoding via Early-exiting for Faster LLM Inference with Thompson Sampling Control Mechanism 14:18: Performance of large language models in numerical vs. semantic medical knowledge: Benchmarking on evidence-based Q&As 16:24: Recovering document annotations for sentence-level bitext 17:40: BLSP-Emo: Towards Empathetic Large Speech-Language Models 19:01: Decoder-only Streaming Transformer for Simultaneous Translation 20:28: Evaluating the IWSLT2023 Speech Translation Tasks: Human Annotations, Automatic Metrics, and Segmentation 21:53: Spontaneous Speech-Based Suicide Risk Detection Using Whisper and Large Language Models 23:06: How Good is Zero-Shot MT Evaluation for Low Resource Indian Languages? 
24:13: HeSum: a Novel Dataset for Abstractive Text Summarization in Hebrew 25:19: ArMeme: Propagandistic Content in Arabic Memes 26:26: Culturally Aware and Adapted NLP: A Taxonomy and a Survey of the State of the Art 27:11: UltraMedical: Building Specialized Generalists in Biomedicine 28:43: Tox-BART: Leveraging Toxicity Attributes for Explanation Generation of Implicit Hate Speech 30:02: A + B: A General Generator-Reader Framework for Optimizing LLMs to Unleash Synergy Potential 31:29: On The Persona-based Summarization of Domain-Specific Documents 33:14: Assessing LLMs for Zero-shot Abstractive Summarization Through the Lens of Relevance Paraphrasing 34:28: American Sign Language Handshapes Reflect Pressures for Communicative Efficiency

  12. 260

    Ep. 255 - June 5, 2024

    ArXiv NLP research for Wednesday, June 05, 2024. 00:19: Improving In-Context Learning with Prediction Feedback for Sentiment Analysis 01:24: MultifacetEval: Multifaceted Evaluation to Probe LLMs in Mastering Medical Knowledge 03:01: Text Injection for Neural Contextual Biasing 04:16: 4D ASR: Joint Beam Search Integrating CTC, Attention, Transducer, and Mask Predict Decoders 06:03: Adversarial Moment-Matching Distillation of Large Language Models 07:05: Docs2KG: Unified Knowledge Graph Construction from Heterogeneous Documents Assisted by Large Language Models 08:48: Readability-guided Idiom-aware Sentence Simplification (RISS) for Chinese 09:56: Evaluation of data inconsistency for multi-modal sentiment analysis 10:55: BadAgent: Inserting and Activating Backdoor Attacks in LLM Agents 12:11: Unveiling Selection Biases: Exploring Order and Token Sensitivity in Large Language Models 13:16: From Tarzan to Tolkien: Controlling the Language Proficiency Level of LLMs for Content Generation 14:20: StreamSpeech: Simultaneous Speech-to-Speech Translation with Multi-task Learning 15:42: RadBARTsum: Domain Specific Adaption of Denoising Sequence-to-Sequence Models for Abstractive Radiology Report Summarization 17:00: Towards Detecting LLMs Hallucination via Markov Chain-based Multi-agent Debate Framework 18:14: Cryptocurrency Frauds for Dummies: How ChatGPT introduces us to fraud? 19:48: FragRel: Exploiting Fragment-level Relations in the External Memory of Large Language Models 20:59: Space Decomposition for Sentence Embedding 22:00: Towards Real-world Scenario: Imbalanced New Intent Discovery 23:40: Which Side Are You On? 
A Multi-task Dataset for End-to-End Argument Summarisation and Evaluation 25:20: CSS: Contrastive Semantic Similarity for Uncertainty Quantification of LLMs 27:03: StatBot.Swiss: Bilingual Open Data Exploration in Natural Language 28:10: Missci: Reconstructing Fallacies in Misrepresented Science 29:43: ChatLang-8: An LLM-Based Synthetic Data Generation Framework for Grammatical Error Correction 30:47: Linking Named Entities in Diderot's Encyclopédie to Wikidata 32:06: Error-preserving Automatic Speech Recognition of Young English Learners' Language 33:37: Document-level Claim Extraction and Decontextualisation for Fact-Checking 34:45: The Challenges of Evaluating LLM Applications: An Analysis of Automated, Human, and LLM-Based Approaches 36:09: LLM-based Rewriting of Inappropriate Argumentation using Reinforcement Learning from Machine Feedback 37:39: IrokoBench: A New Benchmark for African Languages in the Age of Large Language Models 39:46: Automating Turkish Educational Quiz Generation Using Large Language Models 41:34: Cycles of Thought: Measuring LLM Confidence through Stable Explanations 42:57: Are language models rational? The case of coherence norms and belief revision 43:58: What is the Best Way for ChatGPT to Translate Poetry? 45:20: Using Synchronic Definitions and Semantic Relations to Classify Semantic Change Types 46:14: MODABS: Multi-Objective Learning for Dynamic Aspect-Based Summarization 47:09: BIPED: Pedagogically Informed Tutoring System for ESL Education 48:24: Analyzing LLM Behavior in Dialogue Summarization: Unveiling Circumstantial Hallucination Trends 50:00: Wings: Learning Multimodal LLMs without Text-only Forgetting

  13. 259

    Ep. 254 - Part 2 - June 4, 2024

    ArXiv NLP research for Tuesday, June 04, 2024. 00:20: Description Boosting for Zero-Shot Entity and Relation Classification 01:44: Modeling Emotional Trajectories in Written Stories Utilizing Transformers and Weakly-Supervised Learning 03:09: Enhancing Retrieval-Augmented LMs with a Two-stage Consistency Learning Compressor 04:30: Prompting Large Language Models with Human Error Markings for Self-Correcting Machine Translation 05:41: mCoT: Multilingual Instruction Tuning for Reasoning Consistency in Language Models 06:53: Technical Language Processing for Telecommunications Specifications 08:09: On Affine Homotopy between Language Encoders 09:25: Translation Deserves Better: Analyzing Translation Artifacts in Cross-lingual Visual Question Answering 10:32: Probing the Category of Verbal Aspect in Transformer Language Models 11:58: Linguistic Fingerprint in Transformer Models: How Language Variation Influences Parameter Selection in Irony Detection 13:03: LlamaCare: A Large Medical Language Model for Enhancing Healthcare Knowledge Sharing 14:33: Retaining Key Information under High Compression Ratios: Query-Guided Compressor for LLMs 15:51: On the Intrinsic Self-Correction Capability of LLMs: Uncertainty and Latent Concept 17:30: Multiple Choice Questions and Large Languages Models: A Case Study with Fictional Medical Data 19:08: The Scandinavian Embedding Benchmarks: Comprehensive Assessment of Multilingual and Monolingual Text Embedding 20:07: Representations as Language: An Information-Theoretic Framework for Interpretability 21:32: Analyzing Temporal Complex Events with Large Language Models? A Benchmark towards Temporal, Long Context Understanding 22:46: Hiding Text in Large Language Models: Introducing Unconditional Token Forcing Confusion 24:21: Language-Universal Speech Attributes Modeling for Zero-Shot Multilingual Spoken Keyword Recognition 25:37: Deterministic Reversible Data Augmentation for Neural Machine Translation 26:39: CheckEmbed: Effective Verification of LLM Solutions to Open-Ended Tasks 28:14: Scalable MatMul-free Language Modeling 30:03: SpecExec: Massively Parallel Speculative Decoding for Interactive LLM Inference on Consumer Devices 31:37: Mitigate Position Bias in Large Language Models via Scaling a Single Dimension 33:10: TopViewRS: Vision-Language Models as Top-View Spatial Reasoners

  14. 258

    Ep. 254 - Part 1 - June 4, 2024

    ArXiv NLP research for Tuesday, June 04, 2024. 00:20: Conditional Language Learning with Context 01:13: Zyda: A 1.3T Dataset for Open Language Modeling 02:32: RKLD: Reverse KL-Divergence-based Knowledge Distillation for Unlearning Personal Information in Large Language Models 03:50: Personalized Topic Selection Model for Topic-Grounded Dialogue 05:20: Position Debiasing Fine-Tuning for Causal Perception in Long-Term Dialogue 06:58: Phonetic Enhanced Language Modeling for Text-to-Speech Synthesis 08:03: Why Would You Suggest That? Human Trust in Language Model Responses 09:10: Multimodal Reasoning with Multimodal Knowledge Graph 10:30: QROA: A Black-Box Query-Response Optimization Attack on LLMs 11:55: Analyzing Social Biases in Japanese Large Language Models 12:52: I've got the "Answer"! Interpretation of LLMs Hidden States in Question Answering 13:47: PyramidKV: Dynamic KV Cache Compression based on Pyramidal Information Funneling 15:16: Assessing the Performance of Chinese Open Source Large Language Models in Information Extraction Tasks 16:38: LongSSM: On the Length Extension of State-space Models in Language Modelling 17:30: Exploring Mathematical Extrapolation of Large Language Models with Synthetic Data 18:40: MARS: Benchmarking the Metaphysical Reasoning Abilities of Language Models with a Multi-task Evaluation Dataset 20:19: UniOQA: A Unified Framework for Knowledge Graph Question Answering with Large Language Models 22:03: Diver: Large Language Model Decoding with Span-Level Mutual Information Verification 23:12: SimulTron: On-Device Simultaneous Speech to Speech Translation 24:28: The current status of large language models in summarizing radiology report impressions 26:10: Reinforcement Tuning for Detecting Stances and Debunking Rumors Jointly with Large Language Models 27:17: Synergetic Event Understanding: A Collaborative Approach to Cross-Document Event Coreference Resolution with Large Language Models 28:46: A multilingual dataset for offensive language and hate speech detection for hausa, yoruba and igbo languages 29:40: FedMKT: Federated Mutual Knowledge Transfer for Large and Small Language Models 31:17: Self-Modifying State Modeling for Simultaneous Machine Translation

  15. 257

    Ep. 253 - June 3, 2024

    ArXiv NLP research for Monday, June 03, 2024. 00:19: Luna: An Evaluation Foundation Model to Catch Language Model Hallucinations with High Accuracy and Low Cost 01:38: Generative Pre-trained Speech Language Model with Efficient Hierarchical Transformer 03:06: Selectively Answering Visual Questions 04:11: Take its Essence, Discard its Dross! Debiasing for Toxic Language Detection via Counterfactual Causal Effect 05:36: Predicting Drug-Gene Relations via Analogy Tasks with Word Embeddings 06:51: SemCoder: Training Code Language Models with Comprehensive Semantics 08:39: Mobile-Agent-v2: Mobile Device Operation Assistant with Effective Navigation via Multi-Agent Collaboration 10:26: Combining Qualitative and Computational Approaches for Literary Analysis of Finnish Novels 11:45: Strengthened Symbol Binding Makes Large Language Models Reliable Multiple-Choice Selectors 13:26: Decompose, Enrich, and Extract! Schema-aware Event Extraction using LLMs 14:34: MACT: Model-Agnostic Cross-Lingual Training for Discourse Representation Structure Parsing 15:48: Guiding ChatGPT to Generate Salient Domain Summaries 17:51: Synergizing Unsupervised and Supervised Learning: A Hybrid Approach for Accurate Natural Language Task Modeling 19:30: TCMBench: A Comprehensive Benchmark for Evaluating Large Language Models in Traditional Chinese Medicine 21:38: Explore then Determine: A GNN-LLM Synergy Framework for Reasoning over Knowledge Graph 22:51: Two Tales of Persona in LLMs: A Survey of Role-Playing and Personalization 24:08: Are AI-Generated Text Detectors Robust to Adversarial Perturbations? 25:42: Automatic Essay Multi-dimensional Scoring with Fine-tuning and Multiple Regression 26:35: Improving Pseudo Labels with Global-Local Denoising Framework for Cross-lingual Named Entity Recognition 28:01: Demonstration Augmentation for Zero-shot In-context Learning 29:31: EffiQA: Efficient Question-Answering with Strategic Multi-Model Collaboration on Knowledge Graphs 31:05: Towards Scalable Automated Alignment of LLMs: A Survey 32:19: EduNLP: Towards a Unified and Modularized Library for Educational Resources 33:44: Focus on the Core: Efficient Attention via Pruned Token Compression for Document Classification 35:07: Improved Few-Shot Jailbreaking Can Circumvent Aligned Language Models and Their Defenses 36:36: When Can LLMs Actually Correct Their Own Mistakes? A Critical Survey of Self-Correction of LLMs 37:58: CodeR: Issue Resolving with Multi-Agent and Task Graphs 38:54: Unsupervised Distractor Generation via Large Language Model Distilling and Counterfactual Contrastive Decoding 40:10: FactGenius: Combining Zero-Shot Prompting and Fuzzy Relation Mining to Improve Fact Verification with Knowledge Graphs 41:27: Probing Language Models for Pre-training Data Detection 42:45: R2C2-Coder: Enhancing and Benchmarking Real-world Repository-level Code Completion Abilities of Code Large Language Models 44:32: Privacy in LLM-based Recommendation: Recent Advances and Future Directions 45:23: Linguistic Analysis, Description, and Typological Exploration with Categorial Grammar (TheBench Guide) 46:52: D-CPT Law: Domain-specific Continual Pre-Training Scaling Law for Large Language Models 48:52: Do Large Language Models Perform the Way People Expect? Measuring the Human Generalization Function 50:07: Sparsity-Accelerated Training for Large Language Models 51:36: Superhuman performance in urology board questions by an explainable large language model enabled for context integration of the European Association of Urology guidelines: the UroBot study 53:34: Editing the Mind of Giants: An In-Depth Exploration of Pitfalls of Knowledge Editing in Large Language Models 54:42: LexMatcher: Dictionary-centric Data Collection for LLM-based Machine Translation 55:55: Enabling ASR for Low-Resource Languages: A Comprehensive Dataset Creation Approach 57:10: Understanding Token Probability Encoding in Output Embeddings

  16. 256

    Ep. 252 - June 2, 2024

    ArXiv NLP research for Sunday, June 02, 2024. 00:19: Prompt Framework for Role-playing: Generation and Evaluation 01:05: Transforming Computer Security and Public Trust Through the Exploration of Fine-Tuning Large Language Models 02:18: Enhancing Zero-shot Text-to-Speech Synthesis with Human Feedback 03:54: Presence or Absence: Are Unknown Word Usages in Dictionaries? 05:09: Topic Modeling for Short Texts with Large Language Models 06:09: How well do distributed representations convey contextual lexical semantics: a Thesis Proposal 07:05: Evaluating Mathematical Reasoning of Large Language Models: A Focus on Error Identification and Correction 08:27: Automatic Instruction Evolving for Large Language Models 09:25: Applying Intrinsic Debiasing on Downstream Tasks: Challenges and Considerations for Machine Translation 10:26: Developing an efficient corpus using Ensemble Data cleaning approach 11:51: BoNBoN Alignment for Large Language Models and the Sweetness of Best-of-n Sampling 13:15: FOCUS: Forging Originality through Contrastive Use in Self-Plagiarism for Language Models 14:51: The Power of Summary-Source Alignments 16:11: Formality Style Transfer in Persian 17:39: Show, Don't Tell: Aligning Language Models with Demonstrated Feedback 19:08: YODAS: Youtube-Oriented Dataset for Audio and Speech 20:13: MEDIQ: Question-Asking LLMs for Adaptive and Reliable Medical Reasoning 22:15: A Survey of Useful LLM Evaluation 23:31: Unveil the Duality of Retrieval-Augmented Generation: Theoretical Analysis and Practical Solution 25:07: Annotation Guidelines-Based Knowledge Augmentation: Towards Enhancing Large Language Models for Educational Text Classification 27:18: Using RL to Identify Divisive Perspectives Improves LLMs Abilities to Identify Communities on Social Media

  17. 255

    Ep. 251 - June 1, 2024

    ArXiv NLP research for Saturday, June 01, 2024. 00:19: Multi-Dimensional Optimization for Text Summarization via Reinforcement Learning 01:41: CASE: Curricular Data Pre-training for Building Generative and Discriminative Assistive Psychology Expert Models 03:25: Beyond Metrics: Evaluating LLMs' Effectiveness in Culturally Nuanced, Low-Resource Real-World Scenarios 05:03: RoBERTa-BiLSTM: A Context-Aware Hybrid Model for Sentiment Analysis 07:09: The Best of Both Worlds: Toward an Honest and Helpful Large Language Model 09:02: Gender Bias Detection in Court Decisions: A Brazilian Case Study 10:41: Prompt Chaining or Stepwise Prompt? Refinement in Text Summarization 11:54: A Survey on Large Language Models for Code Generation 13:43: Guiding and Diversifying LLM-Based Story Generation via Answer Set Programming 14:46: SPAGHETTI: Open-Domain Question Answering from Heterogeneous Data Sources with Retrieval and Semantic Parsing 15:43: LongSkywork: A Training Recipe for Efficiently Extending Context Length in Large Language Models 17:24: LLMs Could Autonomously Learn Without External Supervision

  18. 254

    Ep. 250 - May 31, 2024

    ArXiv NLP research summaries for May 31, 2024. 00:20 FineRadScore: A Radiology Report Line-by-Line Evaluation Technique Generating Corrections with Severity Scores 01:37 Leveraging Large Language Models for Entity Matching 02:27 Reward-based Input Construction for Cross-document Relation Extraction 03:40 Passage-specific Prompt Tuning for Passage Reranking in Question Answering with Large Language Models 05:04 DORY: Deliberative Prompt Recovery for LLM 06:18 Unveiling the Lexical Sensitivity of LLMs: Combinatorial Optimization for Prompt Enhancement 07:35 It is Simple Sometimes: A Study On Improving Aspect-Based Sentiment Analysis Performance 08:59 FinGen: A Dataset for Argument Generation in Finance 09:42 Improving code-mixed hate detection by native sample mixing: A case study for Hindi-English code-mixed scenario 11:26 Multilingual Text Style Transfer: Datasets & Models for Indian Languages 13:01 An iterated learning model of language change that mixes supervised and unsupervised learning 14:01 Self-Augmented Preference Optimization: Off-Policy Paradigms for Language Model Alignment 15:29 That's Optional: A Contemporary Exploration of "that" Omission in English Subordinate Clauses 16:18 Don't Buy it! Reassessing the Ad Understanding Abilities of Contrastive Multimodal Models 17:20 Improving Reward Models with Synthetic Critiques 18:29 Towards Spoken Language Understanding via Multi-level Multi-grained Contrastive Learning 19:49 clembench-2024: A Challenging, Dynamic, Complementary, Multilingual Benchmark and Underlying Flexible Framework for LLMs as Multi-Action Agents 21:05 A comparison of correspondence analysis with PMI-based word embedding methods 22:05 Large Language Models: A New Approach for Privacy Policy Analysis at Scale 23:36 Preemptive Answer "Attacks" on Chain-of-Thought Reasoning 24:22 Learning to Estimate System Specifications in Linear Temporal Logic using Transformers and Mamba 25:48 OR-Bench: An Over-Refusal Benchmark for Large Language Models 27:20 Superlatives in Context: Explicit and Implicit Domain Restrictions for Superlative Frames 28:41 SaySelf: Teaching LLMs to Express Confidence with Self-Reflective Rationales 30:33 Towards a Fluid computer 31:33 You Only Scan Once: Efficient Multi-dimension Sequential Modeling with LightNet 33:01 LACIE: Listener-Aware Finetuning for Confidence Calibration in Large Language Models 35:02 Direct Alignment of Language Models via Quality-Aware Self-Refinement 36:19 Code Pretraining Improves Entity Tracking Abilities of Language Models

  19. 253

    Ep. 249 - May 30, 2024

    ArXiv NLP research summaries for May 30, 2024.

  20. 252

    Ep. 248 - May 29, 2024

    ArXiv NLP research summaries for May 29, 2024.

  21. 251

    Ep. 247 - May 28, 2024

    ArXiv NLP research summaries for May 28, 2024.

  22. 250

    Ep. 246 - May 27, 2024

    ArXiv NLP summaries for May 27, 2024.

  23. 249

    Ep. 245 - May 26, 2024

    ArXiv NLP research summaries for May 26, 2024.

  24. 248

    Ep. 244 - May 25, 2024

    ArXiv NLP research summaries for May 25, 2024.

  25. 247

    Ep. 243 - May 24, 2024

    ArXiv NLP research summaries for May 24, 2024.

  26. 246

    Ep. 242 - Part 2 - May 23, 2024

    arXiv NLP research summaries for May 23, 2024. Today's Research Themes (AI-Generated): • Exploring efficient model scaling with Super Tiny Language Models, significantly reducing parameters while maintaining performance. • Advancing unsupervised adaptation in speech recognition with Self-TAught Recognizer, achieving robustness across diverse domains. • Introducing Large Language Models-guided adaptation for Temporal Knowledge Graph Reasoning, offering interpretable and dynamically updated reasoning. • Assessing the impact of inflectional endings in morphological analysis for the Uzbek language, enhancing word-level accuracy. • Investigating universal goal hijacking in LLMs with Semantic-guided Prompt Organization, confirming model vulnerability to targeted responses.

  27. 245

    Ep. 242 - Part 1 - May 23, 2024

    arXiv NLP research summaries for May 23, 2024. Today's Research Themes (AI-Generated): • Exploring efficient model scaling with Super Tiny Language Models, significantly reducing parameters while maintaining performance. • Advancing unsupervised adaptation in speech recognition with Self-TAught Recognizer, achieving robustness across diverse domains. • Introducing Large Language Models-guided adaptation for Temporal Knowledge Graph Reasoning, offering interpretable and dynamically updated reasoning. • Assessing the impact of inflectional endings in morphological analysis for the Uzbek language, enhancing word-level accuracy. • Investigating universal goal hijacking in LLMs with Semantic-guided Prompt Organization, confirming model vulnerability to targeted responses.

  28. 244

    Ep. 241 - May 22, 2024

    arXiv NLP research summaries for May 22, 2024. Today's Research Themes (AI-Generated): • Mosaic Instruction Tuning (Mosaic-IT) enhances LLMs by creating diverse instruction data, significantly reducing training costs. • Cross-subject classifiers and GPT2 word prediction improve P300 spellers, enhancing communication for ALS patients. • Dynamic vocabulary in ASR improves recognition performance for phrases, eliminating subword dependencies. • ByteT5 shows promise in multilingual translation of Biblical texts, potentially serving underrepresented language communities. • Zero-shot Adaptive Post Training Quantization method, AdpQ, improves LLM deployment efficiency without the need for calibration data.

  29. 243

    Ep. 240 - May 21, 2024

    arXiv NLP research summaries for May 21, 2024. Today's Research Themes (AI-Generated): • A new method is proposed for the scalable and precise identification of crucial 'circuits' within large language models using sparse autoencoders. • SirLLM enhances Large Language Models (LLMs) with the ability to maintain extended memory for infinite-length dialogues without fine-tuning. • Pyramid KV cache compression is introduced to significantly increase the throughput and decrease memory usage in LLM inference. • ProtT3, a Protein-to-Text Generation framework, is developed to aid Language Models in understanding and generating information from amino acid sequences. • Self-instruction based fine-tuning is shown to balance fact-checking accuracy and explainability in LLMs, while ensuring data security.

  30. 242

    Ep. 239 - May 20, 2024

    arXiv NLP research summaries for May 20, 2024. Today's Research Themes (AI-Generated): • Advancements in ordinal classification techniques for NLP, focusing on explicit and implicit approaches within pretrained language models. • Multi-agent framework leveraging large language models for translating ultra-long literary texts, introducing innovative evaluation strategies. • Exploration of SEARNN as an alternative training approach for RNNs, demonstrating improved machine translation for low-resourced African languages. • Introduction of CoNLL#, a fine-grained error analysis and corrected test set for improved Named Entity Recognition evaluation. • Intuitive Fine-Tuning method aligns Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF) for language model optimization.

  31. 241

    Ep. 238 - May 19, 2024

    arXiv NLP research summaries for May 19, 2024. Today's Research Themes (AI-Generated): • OpenRLHF introduces a scalable RLHF framework for training large language models efficiently with optimized algorithms. • MAML-en-LLM shows significant improvements in in-context learning of LLMs for better adaptation to unseen tasks. • Du-IN model achieves a breakthrough in speech decoding from intracranial neural signals using discrete units-guided mask modeling. • Efficient Prompt Tuning (EPT) enhances prompt tuning for LLMs by reducing training time and increasing performance consistency. • New research explores the predictive capabilities of LLMs within the educational sector and assesses their potential against human expertise.

  32. 240

    Ep. 237 - May 18, 2024

    arXiv NLP research summaries for May 18, 2024. Today's Research Themes (AI-Generated): • Advancing mental health care through the automation of PTSD diagnostics using large language models. • Pioneering the development of a discourse-aware, knowledge-infused model to improve automated disease diagnosis. • Enhancing trust in annotation tasks by identifying LLM-generated content using novel topical embeddings. • Introducing perspectivist NLP datasets that account for diverse worldviews, improving language model generalization. • Developing multi-domain, multilingual lexicon generation to support language diversity in technical fields.

  33. 239

    Ep. 236 - May 17, 2024

    arXiv Computer Vision research summaries for May 17, 2024. Today's Research Themes (AI-Generated): • VLMs safeguarded against patched visual prompt injectors through pixel-wise randomization and SmoothVLM framework • CM-UNet combines CNN and Mamba for efficient semantic segmentation of remote sensing images • LighTDiff employs a lightweight DDPM for enhanced low-light image enhancement in surgical endoscopy • NeRO MLP-based method offers improvements in autonomous driving through accurate road surface reconstruction • SymCode and SymNet introduced to resolve symmetry ambiguity in 6D pose estimation of symmetric objects

  34. 238

    Ep. 235 - May 16, 2024

    arXiv NLP research summaries for May 16, 2024. Today's Research Themes (AI-Generated): • SecureLLM proposes a new secure LLM architecture for handling sensitive data through fine-tuning data silos and user-specific access. • Chameleon presents a mixed-modal early-fusion foundation model offering state-of-the-art image captioning and competitive long-form mixed-modal generation. • Enhancement of multimodal Chain of Thought reasoning through soft negative sampling to reduce hallucination in model outputs is demonstrated. • A study underlines the importance of pre-neural NLP approaches in educational curricula to build foundational understanding despite the dominance of neural methods. • Information Gain Optimized Tokenizer (IGOT) method introduced for domain-adaptive pretraining, offering computational efficiency and customization.

  35. 237

    Ep. 234 - May 15, 2024

    arXiv NLP research summaries for May 15, 2024. Today's Research Themes (AI-Generated): • Novel AMR parser for clinical notes achieves high accuracy on cancer data, demonstrating potential for structured semantic analysis in healthcare. • HumanRankEval (HRE) proposed for evaluating conversational LMs, showcasing effectiveness in ranking model responses aligned with human judgment. • Research reveals BERT's superior performance over traditional methods in identifying online homophobic content, with an open-source dataset contribution. • Study shows word alignment optimization can mitigate hallucination and omission issues in large language model-based machine translation. • Large language models examined for psychological support potential, finding GPT-4 to provide more empathetic responses than Chat-GPT.

  36. 236

    Ep. 233 - May 14, 2024

    arXiv NLP research summaries for May 14, 2024. Today's Research Themes (AI-Generated): • Exploring a novel model for joint extraction of entities and relations with enhanced information interaction in NLP. • Investigating adversarial robustness and countermeasures of multimodal speech-language models. • Introducing Seal-Tools, a self-instruct learning dataset for agent tuning and benchmarking in language models. • Addressing error correction in clinical text using ensembles of large language models and error categorization. • Proposing stylometric watermarks to distinguish between human and large language model-generated texts.

  37. 235

    Ep. 232 - May 13, 2024

    arXiv NLP research summaries for May 13, 2024. Today's Research Themes (AI-Generated): • Improved Text-to-SQL generation via multiple prompts and choice selection outperforms prior in-context learning methods. • Evaluating large language models in medical applications presents unique challenges and opportunities for integration into clinical practice. • Curriculum learning strategies show potential for enhancing Large Language Model performance without scaling model size. • 'MacBehaviour' R package enables behavioral experimentation with various large language models for psychological studies. • Efficient Multi-sample Speculative Decoding 'EMS-SD' accelerates Large Language Models inference processes.

  38. 234

    Ep. 231 - May 12, 2024

    arXiv Computer Vision research summaries for May 12, 2024. Today's Research Themes (AI-Generated): • Enhancing multi-modal machine learning with Meta-learned Cross-modal Knowledge Distillation for improved performance on tasks with missing modalities. • Developing resource-efficient semi-self-supervised domain adaptation techniques for precise agricultural tasks in varying conditions. • Proposing Energy Plan Denoising for stochastic trajectory prediction acknowledging pedestrian intrinsic uncertainties in autonomous driving. • Introducing memory-efficient image processing frameworks for high-resolution vision systems to facilitate detailed object insights. • Demonstrating advanced 3D hand mesh recovery from monocular RGB for improved human-computer interaction in complex environments.

  39. 233

    Ep. 230 - May 11, 2024

    arXiv NLP research summaries for May 11, 2024. Today's Research Themes (AI-Generated): • Combinatoriality in human language reflected through library learning in Chinese writing system evolution. • Code Representation and Execution (CoRE) enabling natural language, pseudo-code, and flow programming via large language models. • Introduction of EmoMix-3L, a novel code-mixed dataset for multilingual emotion detection in Bangla, English, and Hindi. • Piccolo2 sets new benchmark in text embedding with multi-task hybrid loss training for diverse NLP tasks. • AraSpell framework advances Arabic spelling correction using deep learning and artificial data generation.

  40. 232

    Ep. 229 - May 10, 2024

    arXiv NLP research summaries for May 10, 2024. Today's Research Themes (AI-Generated): • SaudiBERT demonstrates superior performance in processing Arabic text in Saudi dialect, outperforming multi-dialectal models. • Automated generation of model and data cards by Large Language Models promotes responsible AI through enhanced documentation. • D-Pruner introduces domain-specific, task-agnostic compression to improve the efficiency of Large Language Models. • Aspect-based summarization of health answers from CQA forums captures diverse opinions and solutions, improving platform usability. • Exploratory study of gender bias in Hindi language technology reveals the importance of context-specific approaches.

  41. 231

    Ep. 228 - May 9, 2024

    arXiv NLP research summaries for May 09, 2024. Today's Research Themes (AI-Generated): • Cline dataset adds human acceptability judgments to English-Hindi code-mixed text, enhancing natural language processing models. • OpenFactCheck introduces a unified factuality evaluation for large language models, aiming to ensure output accuracy. • G-SAP integrates knowledge graphs with language models to improve commonsense reasoning and cross-modal knowledge transfer. • Assessing dialect robustness of language models reveals significant performance disparity across English dialects. • Novel Chain of Attack method exposes vulnerabilities of LLMs in multi-turn dialogues by adjusting attack strategies contextually.

  42. 230

    Ep. 227 - May 8, 2024

    arXiv NLP research summaries for May 08, 2024. Today's Research Themes (AI-Generated): • ACORN dataset provides insights into LLMs' explanation evaluation consistency compared to human raters. • DALK framework enhances LLM capabilities for specialized domain knowledge integration, demonstrated on Alzheimer's Disease. • APrompt4EM introduces augmented prompt tuning, showing promise in low-resource generalized entity matching. • ChuXin is a fully open-source 1.6B parameter language model aimed at increasing transparency and fostering innovation. • Two innovative approaches to NER: human-annotated corpora for Indian languages and prompt-based logical reasoning enhancement.

  43. 229

    Ep. 226 - May 7, 2024

    arXiv NLP research summaries for May 07, 2024. Today's Research Themes (AI-Generated): • Hybrid AI strategy reduces hallucinations in text summarization using GPT refinement process. • Philosophy of cognitive science gains new insights from the progress in deep learning. • GPT models demonstrate robust evaluation potential for transformer-based text summaries. • FlashBack enhances inference efficiency in Retrieval-Augmented Language Modeling (RALM). • PuzzleBen, a weakly supervised benchmark, advances LLMs’ reasoning abilities with minimal human supervision.

  44. 228

    Ep. 225 - May 6, 2024

    arXiv Computer Vision research summaries for May 06, 2024. Today's Research Themes (AI-Generated): • Diffusion models showcase potential for high-quality video generation and modification. • Deep learning frameworks advance multi-parametric estimation in MR imaging. • Enhanced multimodal AI models set new standards in medical data analysis. • Improved cross-modal feature fusion drives RGB-T tracking performance. • Dual-encoder models trained with language corpora improve paraphrased query retrieval.

  45. 227

    Ep. 224 - May 5, 2024

    arXiv NLP research summaries for May 05, 2024. Today's Research Themes (AI-Generated): • Exploring negative emotional stimuli can enhance LLMs' performance on complex tasks. • A novel concept-based RAG framework utilizes AMR to enhance information retrieval in LLMs. • The FairMonitor framework adopts a static-dynamic method for detecting biases in LLMs. • Cultural reasoning in LLMs is improved through tuning with culturally-related instruction datasets. • A fully differentiable MoE architecture for language model pre-training exhibits superior performance.

  46. 226

    Ep. 223 - May 4, 2024

    arXiv NLP research summaries for May 04, 2024. Today's Research Themes (AI-Generated): • A proposed QUEST framework aims to standardize human evaluation of LLMs in healthcare, ensuring safety and reliability. • A new Transformer and BERT-based model advances Vietnamese spelling correction, outperforming Google Docs. • Astro-NER explores LLMs as domain expert annotators, contributing to named entity recognition in astronomy literature. • The Mixat dataset enhances Arabic-English ASR by addressing code-switching in Emirati speech. • Research demonstrates LLMs' potential in historical analysis, unveiling narrative patterns in Holocaust testimonies.

  47. 225

    Ep. 222 - May 3, 2024

    arXiv NLP research summaries for May 03, 2024. Today's Research Themes (AI-Generated): • SGHateCheck introduces a hate speech detection framework tailored to Singapore's linguistic diversity, enhancing content moderation. • SUKHSANDESH, an AI-based sexual education platform for rural India, leverages avatar therapy to increase empathy and understanding. • Research demonstrates how external knowledge and goal-planning improve conversational recommender systems powered by Large Language Models. • An advanced Bi-LSTM model enhances next-word prediction and sentence completion for the Bangla language, reaching high accuracy levels. • The study on DALLMi explores semi-supervised domain adaption in text multi-label classification with Large Language Models.

  48. 224

    Ep. 221 - May 2, 2024

    arXiv NLP research summaries for May 02, 2024. Today's Research Themes (AI-Generated): • IgboAPI dataset advances Igbo language technologies by enriching machine translation and semantic lexicon with multi-dialectal data. • UniGen addresses domain generalization in sentiment analysis through universal zero-shot dataset generation, enhancing small model applicability. • Efficient data generation for dialogue systems is demonstrated by combining large language model prompting with human expertise in MISeD dataset creation. • Challenges in modelling human dialogue acts for grounding communication are analyzed, highlighting the limits of supervised learning-based NLP dialogue models. • The TartuNLP team's first-place win in EvaLatin 2024 showcases the efficacy of Large Language Model-aided annotation for emotion polarity detection in historical Latin texts.

  49. 223

    Ep. 220 - May 1, 2024

    arXiv NLP research summaries for May 01, 2024. Today's Research Themes (AI-Generated): • Advances in large language models (LLMs) emphasize robustness and alignment with human values. • Novel fine-tuning techniques enhance model performance without a significant increase in computational costs. • Multimodal models, incorporating text and imagery, show progress in complex tasks like sarcasm detection and sentiment analysis. • Investigations reveal the persistence of biases in language models, even absent gender-related language in inputs. • Model quantization techniques aim to balance model confidence and performance with efficiency and lower resource consumption.

  50. 222

    Ep. 219 - April 30, 2024

    arXiv NLP research summaries for April 30, 2024. Today's Research Themes (AI-Generated): • HydraLoRA outperforms other Parameter-Efficient Fine-Tuning methods by leveraging asymmetric structures without the need for domain knowledge. • ViTHSD dataset contributes to improved targeted hate speech detection on Vietnamese social media using a hybrid deep learning model. • New benchmark 'Suvach' elevates Hindi question answering by generating datasets specifically crafted for Hindi language models. • Graph Attention Network enhanced by dependency tree structures achieves state-of-the-art performance in aspect and opinion term extraction. • Octopus v4 integrates multiple open-source language models to handle specialized tasks, challenging the dominance of proprietary models like GPT-4.

ABOUT THIS SHOW

TechcraftingAI NLP brings you daily summaries of the latest arXiv Computation and Language research.

HOSTED BY

Brad Edwards
