2019
INTERSPEECH
VAE-Based Regularization for Deep Speaker Embedding
Abstract
Deep speaker embedding has achieved state-of-the-art performance in speaker recognition. A potential problem is that these embedded vectors (called ‘x-vectors’) are not Gaussian, which degrades performance with the widely used PLDA back-end scoring. In this paper, we propose a regularization approach based on the Variational Auto-Encoder (VAE). This model transforms x-vectors to a latent space where the mapped latent codes are more Gaussian and hence more suitable for PLDA scoring.
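To illustrate why a VAE pushes latent codes toward a Gaussian, the following is a minimal sketch of a generic VAE objective, not the paper's implementation. It assumes a diagonal-Gaussian encoder output (`mu`, `log_var`), a standard-normal prior, and a squared-error reconstruction term. The function names and the `kl_weight` parameter are hypothetical; the abstract does not specify the architecture, loss weighting, or any additional terms.

```python
import math
import random


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over dimensions."""
    return 0.5 * sum(
        m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, log_var)
    )


def reparameterize(mu, log_var, rng):
    """Draw z = mu + sigma * eps, with eps ~ N(0, 1) per dimension."""
    return [
        m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
        for m, lv in zip(mu, log_var)
    ]


def squared_error(x, x_hat):
    """Reconstruction term: squared distance between input and reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))


def vae_loss(x, x_hat, mu, log_var, kl_weight=1.0):
    """Generic VAE loss: reconstruction error plus weighted KL regularizer.

    kl_weight is an assumed knob for illustration, not a value from the paper.
    """
    return squared_error(x, x_hat) + kl_weight * kl_to_standard_normal(mu, log_var)


if __name__ == "__main__":
    rng = random.Random(0)
    x = [0.3, -1.2, 0.8]                    # stand-in for an x-vector
    mu, log_var = [0.1, -0.4, 0.2], [0.0, -0.5, 0.1]
    z = reparameterize(mu, log_var, rng)    # latent code that a back-end would score
    x_hat = x                               # pretend the decoder reconstructs perfectly
    print(len(z), vae_loss(x, x_hat, mu, log_var))
```

The KL term is zero only when the encoder outputs zero mean and unit variance, so minimizing it draws the latent codes toward a standard normal, which is the property the abstract relies on for PLDA scoring.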