Supervised Adaptation of Sequence-to-Sequence Speech Recognition Systems using Batch-Weighting

Christian Huber; Juan Hussain; Tuan-Nam Nguyen; Kaihang Song; Sebastian Stüker; Alexander Waibel

AACL 2020

Abstract

When training speech recognition systems, one often has sufficient training data for the language in question but only small amounts of data for the target domain. This problem is even more pronounced for end-to-end speech recognition systems, which accept only transcribed speech as training data; such data is harder and more expensive to obtain than text data. In this paper we present experiments in adapting end-to-end speech recognition systems with a method called batch-weighting, which we contrast against regular fine-tuning, i.e., continuing to train existing neural speech recognition models on adaptation data. We apply these techniques to adaptation to topic, accent and vocabulary, and show that batch-weighting consistently outperforms fine-tuning. To show the generalization capabilities of batch-weighting, we perform experiments in several languages: Arabic, English and German. Due to its relatively small computational requirements, batch-weighting is a suitable technique for supervised life-long learning during the lifetime of a speech recognition system, e.g., from user corrections.

Topics

Deep Learning > Architectures > Transformers; Speech & Audio > Recognition > Speech Recognition; Machine Learning > Learning Types > Supervised Learning; Machine Learning > Learning Types > Transfer Learning; Deep Learning > Techniques > Transfer Learning

Keywords

domain adaptation; speech recognition; model adaptation; multilingual speech; end-to-end model; end-to-end speech recognition; batch-weighting technique; fine-tuning comparison
