The Opensesame NIST 2016 Speaker Recognition Evaluation System

Gang Liu; Qi Qian; Zhibin Wang; Qingen Zhao; Tianzhou Wang; Hao Li; Jian Xue; Shenghuo Zhu; Rong Jin; Tuo Zhao

INTERSPEECH 2017

Abstract

The last two decades have witnessed significant progress in speaker recognition, as evidenced by the improving performance in the speaker recognition evaluations (SRE) hosted by NIST. Despite this progress, little research has focused on speaker recognition under short-duration and language-mismatch conditions, which often lead to poor recognition performance. In NIST SRE2016, these concerns were systematically investigated by the speaker recognition community for the first time. In this study, we address these challenges from the viewpoint of feature extraction and modeling. In particular, we improve the robustness of features by combining GMM- and DNN-based iVector extraction approaches, and improve the reliability of the back-end model by exploiting a symmetric SVM that can effectively leverage the unlabeled data. We also introduce distance metric learning to improve generalization from the development data, which is usually limited in size. Finally, a fusion strategy is adopted to collectively boost the performance. The effectiveness of the proposed scheme for speaker recognition is demonstrated on the SRE2016 evaluation data: compared with a DNN-iVector PLDA baseline system, our method yields a 25.6% relative improvement in terms of min_Cprimary.
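As a reading aid for the relative improvement quoted in the abstract, the following minimal sketch shows how a relative reduction in a cost metric such as min_Cprimary is conventionally computed. The function name `relative_improvement` and the numbers in the usage line are illustrative assumptions, not values or code from the paper; the abstract does not report the absolute costs.

```python
def relative_improvement(baseline_cost: float, system_cost: float) -> float:
    """Return the relative reduction of a cost metric, in percent.

    Lower cost is better, so a positive result means the system
    improves on the baseline.
    """
    if baseline_cost <= 0:
        raise ValueError("baseline_cost must be positive")
    return 100.0 * (baseline_cost - system_cost) / baseline_cost


if __name__ == "__main__":
    # Placeholder costs chosen only to exercise the formula;
    # they are NOT results from the paper.
    print(f"{relative_improvement(0.5, 0.4):.1f}%")
```

Under this convention, a 25.6% relative improvement means the system's min_Cprimary is 74.4% of the baseline's value.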

Topics

Machine Learning > Core Methods > Metric Learning
Machine Learning > Application Areas > Domain Adaptation

Keywords

speaker recognition; distance metric learning; probabilistic linear discriminant analysis; symmetric support vector machine
