EMNLP 2023

Disfluent Cues for Enhanced Speech Understanding in Large Language Models

Abstract

In computational linguistics, the common practice is to “clean” disfluent content from spontaneous speech. However, we hypothesize that these disfluencies might serve as more than mere noise, potentially acting as informative cues. We use a range of pre-trained models for a reading comprehension task involving disfluent queries, specifically featuring different types of speech repairs. The findings indicate that certain disfluencies can indeed improve model performance, particularly those stemming from context-based adjustments. However, large-scale language models struggle to handle repairs involving decision-making or the correction of lexical or syntactic errors, suggesting a crucial area for potential improvement. This paper thus highlights the importance of a nuanced approach to disfluencies, advocating for their potential utility in enhancing model performance rather than their removal.
