Disfluent Cues for Enhanced Speech Understanding in Large Language Models

Morteza Rohanian; Farhad Nooralahzadeh; Omid Rohanian; David Clifton; Michael Krauthammer

EMNLP 2023

Abstract

In computational linguistics, the common practice is to “clean” disfluent content from spontaneous speech. However, we hypothesize that these disfluencies might serve as more than mere noise, potentially acting as informative cues. We use a range of pre-trained models for a reading comprehension task involving disfluent queries, specifically featuring different types of speech repairs. The findings indicate that certain disfluencies can indeed improve model performance, particularly those stemming from context-based adjustments. However, large-scale language models struggle to handle repairs involving decision-making or the correction of lexical or syntactic errors, suggesting a crucial area for potential improvement. This paper thus highlights the importance of a nuanced approach to disfluencies, advocating for their potential utility in enhancing model performance rather than their removal.

Topics

Natural Language Processing > Resources & Methods > Large Language Models
Interdisciplinary > Linguistics > Computational Linguistics
Machine Learning > Learning Types > Supervised Learning
Machine Learning > Learning Types > Representation Learning
Artificial Intelligence > Core AI > Large Language Models
Speech & Audio > Analysis > Speech Analysis
Deep Learning > Learning Types > Self-Supervised Learning
Artificial Intelligence > Core AI > Natural Language Processing

Keywords

representation learning, prompt engineering, reading comprehension, disfluency detection, pre-trained language model, speech understanding, speech disfluency, large language model, speech repair, context-based adjustment
