Optimization of False Acceptance/Rejection Rates and Decision Threshold for End-to-End Text-Dependent Speaker Verification Systems

Victoria Mingote; Antonio Miguel; Dayana Ribas; Alfonso Ortega; Eduardo Lleida

INTERSPEECH 2019

Abstract

Currently, most Speaker Verification (SV) systems based on neural networks use Cross-Entropy and/or Triplet loss functions. Although these functions provide competitive results, they may not realize the system's full performance, because they are not designed to optimize the verification task with respect to its performance measures, e.g. the Detection Cost Function (DCF) or the Equal Error Rate (EER). This paper proposes a first approach to this issue through the optimization of a loss function based on the DCF. This mechanism allows the end-to-end system to directly manage the threshold used to compute the ratio between the False Rejection Rate (FRR) and the False Acceptance Rate (FAR), thereby connecting system training directly to the operating point. In a text-dependent speaker verification framework based on neural network super-vectors and evaluated on the RSR2015 dataset, the proposed approach outperforms reference systems using Cross-Entropy and Triplet loss, as well as our previous proposal based on an approximation of the Area Under the Curve (aAUC).
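To make the idea of a DCF-based loss with a managed threshold more concrete, the following is a minimal sketch and not the authors' implementation. It assumes the standard detection cost form (miss cost times target prior times FRR, plus false-alarm cost times non-target prior times FAR) and makes it differentiable by replacing the hard threshold comparison with a sigmoid. The function name `soft_dcf` and the parameters `sharpness`, `c_miss`, `c_fa`, and `p_target` are illustrative assumptions; the abstract does not specify them.

```python
import numpy as np


def sigmoid(x):
    """Logistic function, used here as a smooth stand-in for a step function."""
    return 1.0 / (1.0 + np.exp(-x))


def soft_dcf(target_scores, nontarget_scores, threshold,
             c_miss=1.0, c_fa=1.0, p_target=0.5, sharpness=10.0):
    """Smooth approximation of the detection cost at a given threshold.

    All parameter names and default values are hypothetical choices for
    this sketch. Returns (cost, soft_frr, soft_far).
    """
    target_scores = np.asarray(target_scores, dtype=float)
    nontarget_scores = np.asarray(nontarget_scores, dtype=float)

    # Soft FRR: fraction of target trials scoring below the threshold.
    soft_frr = np.mean(sigmoid(sharpness * (threshold - target_scores)))
    # Soft FAR: fraction of non-target trials scoring above the threshold.
    soft_far = np.mean(sigmoid(sharpness * (nontarget_scores - threshold)))

    # Weighted sum of the two error rates (standard DCF form).
    cost = c_miss * p_target * soft_frr + c_fa * (1.0 - p_target) * soft_far
    return cost, soft_frr, soft_far


if __name__ == "__main__":
    targets = [2.0, 3.0]        # scores for same-speaker trials (toy data)
    nontargets = [-2.0, -3.0]   # scores for different-speaker trials (toy data)
    for thr in (0.0, 10.0):
        cost, frr, far = soft_dcf(targets, nontargets, thr)
        print(f"threshold={thr:5.1f}  cost={cost:.4f}  FRR={frr:.4f}  FAR={far:.4f}")
```

Because the cost is a smooth function of both the scores and `threshold`, an automatic-differentiation framework could in principle treat the threshold as a trainable parameter alongside the network weights, which is one way to read the abstract's claim that training is tied directly to the operating point.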

Topics

Machine Learning > Optimization & Theory > Loss Functions
Deep Learning > Architectures > Neural Networks

Keywords

speaker verification, triplet loss, false rejection rate, false acceptance rate, decision threshold, neural network super-vector
