2019 INTERSPEECH INTERSPEECH 2019

Autoencoder-Based Semi-Supervised Curriculum Learning for Out-of-Domain Speaker Verification

Abstract

This study aims to improve the performance of speaker verification system when no labeled out-of-domain data is available. An autoencoder-based semi-supervised curriculum learning scheme is proposed to automatically cluster unlabeled data and iteratively update the corpus during training. This new training scheme allows us to (1) progressively expand the size of training corpus by utilizing unlabeled data and correcting previous labels at run-time; and (2) improve robustness when generalizing to multiple conditions, such as out-of-domain and text-independent speaker verification tasks. It is also discovered that a denoising autoencoder can significantly enhance the clustering accuracy when it is trained on carefully-selected subset of speakers. Our experimental results show a relative reduction of 30%–50% in EER compared to the baseline.

🧭 Keyword Pioneer — out-of-domain generalization
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Speech & Audio
🌉 Interdisciplinary Bridge — Machine Learning and Speech & Audio