INTERSPEECH 2018

Angular Softmax for Short-Duration Text-independent Speaker Verification

Abstract

Recently, researchers have proposed deep-learning-based end-to-end speaker verification (SV) systems that achieve competitive results compared with the standard i-vector approach. In addition to the deep learning architecture, the optimization metric, such as softmax loss or triplet loss, is important for extracting speaker embeddings that are discriminative and generalize to unseen speakers. In this paper, angular softmax (A-softmax) loss is introduced to improve speaker embedding quality. It is investigated in two SV frameworks: a CNN-based end-to-end SV framework and an i-vector SV framework in which deep discriminant analysis is used for channel compensation. Experimental results on a short-duration text-independent speaker verification dataset generated from SRE show that A-softmax achieves significant performance improvements over other metrics in both frameworks.
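The abstract does not spell out the A-softmax loss itself, so the following is only a minimal sketch of the commonly cited formulation, in which class weights are length-normalized, biases are dropped, and the target-class angle is penalized through the margin function ψ(θ) = (−1)^k cos(mθ) − 2k for θ in [kπ/m, (k+1)π/m]. The function names `psi` and `a_softmax_loss`, the default margin `m=4`, and the toy two-speaker example are assumptions made for illustration and are not taken from the paper.

```python
import math


def psi(theta, m):
    """Margin function: (-1)^k * cos(m*theta) - 2k on [k*pi/m, (k+1)*pi/m]."""
    # k indexes the angular segment that theta falls in; clamp so theta = pi maps to m - 1.
    k = min(int(theta * m / math.pi), m - 1)
    return (-1) ** k * math.cos(m * theta) - 2 * k


def a_softmax_loss(x, weights, label, m=4):
    """A-softmax loss for a single embedding x (assumed formulation, see lead-in).

    x       : embedding vector (list of floats)
    weights : one weight vector per training speaker
    label   : index of the true speaker
    m       : integer angular margin (m = 1 gives softmax with normalized weights)
    """
    x_norm = math.sqrt(sum(v * v for v in x))
    logits = []
    for j, w in enumerate(weights):
        w_norm = math.sqrt(sum(v * v for v in w))
        cos_t = sum(a * b for a, b in zip(x, w)) / (x_norm * w_norm)
        cos_t = max(-1.0, min(1.0, cos_t))  # guard acos against rounding error
        if j == label:
            # Target class: replace cos(theta) with the stricter psi(theta).
            logits.append(x_norm * psi(math.acos(cos_t), m))
        else:
            logits.append(x_norm * cos_t)
    # Numerically stable log-sum-exp, then negative log-likelihood of the label.
    peak = max(logits)
    log_z = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return log_z - logits[label]


# Toy example: two speakers, one 2-D embedding closer to speaker 0.
embedding = [1.0, 0.5]
speaker_weights = [[1.0, 0.0], [0.0, 1.0]]
loss_plain = a_softmax_loss(embedding, speaker_weights, label=0, m=1)
loss_margin = a_softmax_loss(embedding, speaker_weights, label=0, m=4)
```

In this sketch the margin version never yields a smaller loss than the `m=1` version for the same input, because ψ(θ) ≤ cos(θ) for every angle. That is the mechanism by which an angular margin pushes embeddings of the same speaker closer together in angle.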
