2020 INTERSPEECH INTERSPEECH 2020

Continual Learning for Multi-Dialect Acoustic Models

Abstract

Using data from multiple dialects has shown promise in improving neural network acoustic models. While such training can improve the performance of an acoustic model on a single dialect, it can also produce a model capable of good performance on multiple dialects. However, training an acoustic model on pooled data from multiple dialects takes a significant amount of time and computing resources, and it needs to be retrained every time a new dialect is added to the model. In contrast, sequential transfer learning (fine-tuning) does not require retraining using all data, but may result in catastrophic forgetting of previously-seen dialects. Using data from four english dialects, we demonstrate that by using loss functions that mitigate catastrophic forgetting, sequential transfer learning can be used to train multi-dialect acoustic models that narrow the WER gap between the best (combined training) and worst (fine-tuning) case by up to 65%. Continual learning shows great promise in minimizing training time while approaching the performance of models that require much more training time.

🧭 Keyword Pioneer — sequential transfer learning
🐣 Hot Topic Early Bird — continual learning
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Deep Learning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Speech & Audio