INTERSPEECH 2024

AR-NLU: A Framework for Enhancing Natural Language Understanding Model Robustness against ASR Errors

Abstract

A major challenge for pipeline spoken language understanding systems is that errors from the upstream automatic speech recognition (ASR) engine degrade downstream natural language understanding (NLU) models. To address this challenge, we propose an ASR-Robust NLU (AR-NLU) framework that extends a pre-existing NLU model by training it simultaneously on two input streams: human-generated (gold) transcripts and noisy ASR transcripts. We apply contrastive learning so that the model learns the same representations and predictions for both gold and ASR inputs, thereby enhancing its robustness to ASR noise. To demonstrate the effectiveness of this framework, we present two AR-NLU models: Robust Intent DEtection (RIDE) and the ASR-Robust BI-encoder for NameD Entity Recognition (AR-BINDER). Experimental results show that the proposed AR-NLU framework is applicable to various NLU models and significantly outperforms the original models in both sequence and token classification tasks.
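The abstract does not spell out the training objective, so the following is only a minimal sketch of how a two-stream objective of this kind could look. It assumes a supervised cross-entropy term on both the gold and the ASR stream, plus an InfoNCE-style contrastive term that pulls the gold and ASR embeddings of the same utterance together and treats other utterances in the batch as negatives. The function names (`cross_entropy`, `info_nce`, `two_stream_loss`), the temperature, and the weighting are hypothetical and are not taken from the paper; the paper's additional alignment of predictions is not modelled here.

```python
import math


def log_softmax(row):
    """Numerically stable log-softmax over one list of scores."""
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]


def cross_entropy(logits, labels):
    """Mean cross-entropy over a batch of logit rows and integer labels."""
    total = sum(-log_softmax(row)[y] for row, y in zip(logits, labels))
    return total / len(labels)


def cosine(u, v, eps=1e-12):
    """Cosine similarity between two vectors (eps guards against zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + eps)


def info_nce(gold_emb, asr_emb, temperature=0.1):
    """InfoNCE-style loss (an assumption, not the paper's stated loss).

    Row i of gold_emb and row i of asr_emb come from the same utterance
    (positive pair); every other ASR row in the batch is a negative.
    """
    sims = [[cosine(g, a) / temperature for a in asr_emb] for g in gold_emb]
    targets = list(range(len(gold_emb)))
    return cross_entropy(sims, targets)


def two_stream_loss(gold_logits, asr_logits, gold_emb, asr_emb, labels,
                    contrastive_weight=1.0):
    """Hypothetical combined objective: task loss on both streams plus
    a weighted contrastive term between the two streams' embeddings."""
    task = cross_entropy(gold_logits, labels) + cross_entropy(asr_logits, labels)
    return task + contrastive_weight * info_nce(gold_emb, asr_emb)


if __name__ == "__main__":
    # Toy batch of two utterances with 2-dimensional embeddings.
    gold_emb = [[1.0, 0.0], [0.0, 1.0]]
    asr_emb = [[0.9, 0.1], [0.1, 0.9]]      # close to their gold partners
    logits = [[2.0, 0.0], [0.0, 2.0]]       # same predictions on both streams
    labels = [0, 1]
    print(two_stream_loss(logits, logits, gold_emb, asr_emb, labels))
```

In this sketch the contrastive term is small when each ASR embedding lies near the gold embedding of the same utterance and grows when the pairs are mismatched, which is the behaviour the abstract describes at a high level.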
