COLING 2022

Adaptive Unsupervised Self-training for Disfluency Detection

Abstract

Supervised methods have achieved remarkable results in disfluency detection. However, in real-world scenarios, human-annotated data is difficult to obtain. Recent work addresses disfluency detection with unsupervised self-training, which can efficiently exploit existing large-scale unlabeled data. However, these self-training-based methods suffer from selection bias and error accumulation. To tackle these problems, we propose an adaptive unsupervised self-training method for disfluency detection. Specifically, we re-weight the importance of each training example according to its grammatical feature and prediction confidence. Experiments on the Switchboard dataset show that our method improves by 2.3 points over the current state-of-the-art (SOTA) unsupervised method. Moreover, our method is competitive with the SOTA supervised method.
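
The re-weighting idea can be illustrated with a minimal sketch. The abstract does not give the weighting formula, so everything here is an assumption made for illustration: the function names `example_weight` and `weighted_nll` are hypothetical, the grammatical feature is reduced to a single score in [0, 1], and the two signals are combined by a simple product. It shows only how per-example weights could scale a self-training loss on pseudo-labeled data, not the method as published.

```python
import math


def example_weight(grammar_score: float, confidence: float) -> float:
    """Hypothetical per-example weight (assumption, not the paper's formula).

    grammar_score: assumed grammaticality score of the example, in [0, 1].
    confidence: model probability assigned to its own pseudo-label, in [0, 1].
    """
    if not (0.0 <= grammar_score <= 1.0 and 0.0 <= confidence <= 1.0):
        raise ValueError("both scores must lie in [0, 1]")
    # Assumed combination rule: a plain product, so an example counts
    # only when both signals are high.
    return grammar_score * confidence


def weighted_nll(pseudo_label_probs, weights) -> float:
    """Weighted average negative log-likelihood over pseudo-labeled examples.

    pseudo_label_probs: probability the model gives to each pseudo-label (> 0).
    weights: one non-negative weight per example.
    """
    total_weight = sum(weights)
    if total_weight == 0.0:
        return 0.0  # nothing is trusted, so nothing contributes
    loss = sum(-w * math.log(p) for p, w in zip(pseudo_label_probs, weights))
    return loss / total_weight


# Usage: the second example has a low grammaticality score, so it
# contributes far less to the loss than the first.
weights = [example_weight(0.9, 0.95), example_weight(0.2, 0.6)]
loss = weighted_nll([0.95, 0.6], weights)
```

Under this assumed scheme, down-weighting low-confidence or ungrammatical pseudo-labeled examples is one way to limit how much a wrong pseudo-label feeds back into later training rounds, which is the error-accumulation problem the abstract names.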
