INTERSPEECH 2018

Fast Variational Bayes for Heavy-tailed PLDA Applied to i-vectors and x-vectors

Abstract

The standard state-of-the-art backend for text-independent speaker recognizers that use i-vectors or x-vectors is Gaussian PLDA (G-PLDA), assisted by a Gaussianization step involving length normalization. G-PLDA can be trained with either generative or discriminative methods. It has long been known that heavy-tailed PLDA (HT-PLDA), applied without length normalization, gives similar accuracy, but at considerable extra computational cost. We have recently introduced a fast scoring algorithm for a discriminatively trained HT-PLDA backend. This paper extends that work by introducing a fast, variational Bayes, generative training algorithm. We compare old and new backends, with and without length normalization, with i-vectors and x-vectors, on SRE’10, SRE’16 and SITW.
