Blind Speech Signal Quality Estimation for Speaker Verification Systems

Galina Lavrentyeva; Marina Volkova; Anastasia Avdeeva; Sergey Novoselov; Artem Gorlanov; Tseren Andzhukaev; Artem Ivanov; Alexander Kozlov

INTERSPEECH 2020

Abstract

The problem of system performance degradation in mismatched acoustic conditions is widely acknowledged in the community and is common to many fields. Current state-of-the-art deep speaker embedding models are domain-sensitive. The main idea of this research is to develop a single method for automatic signal quality estimation that can evaluate short-term signal characteristics. This paper presents a neural-network-based approach for blind speech signal quality estimation in terms of signal-to-noise ratio (SNR) and reverberation time (RT60), which is also able to classify the type of underlying additive noise. Additionally, the research revealed the need for an accurate voice activity detector (VAD) that performs well in both clean and noisy unseen environments. Therefore, a novel neural network VAD based on the U-Net architecture is presented. The proposed algorithms make it possible to analyze the NIST, SITW, and VOiCES datasets, commonly used for objective comparison of speaker verification systems, from a new point of view and to consider effective calibration steps that improve speaker recognition quality on them.
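For orientation, the two quality measures named in the abstract can be written down from their standard definitions. The sketch below is not the paper's blind estimator, which has to predict these values from the degraded signal alone. It only computes the reference quantities when speech and noise are known separately, as they would be when building a simulated training set. The function names (`mean_power`, `snr_db`, `rt60_from_decay_rate`) are hypothetical and chosen for illustration.

```python
import math


def mean_power(samples):
    """Average power of a sequence of samples (mean of squared values)."""
    return sum(x * x for x in samples) / len(samples)


def snr_db(speech, noise):
    """Reference SNR in dB, assuming speech and noise are available separately."""
    return 10.0 * math.log10(mean_power(speech) / mean_power(noise))


def rt60_from_decay_rate(decay_db_per_second):
    """RT60 is the time for sound energy to decay by 60 dB.

    Assumes a constant (linear in dB) decay rate, which is an idealization.
    """
    return 60.0 / decay_db_per_second


# Toy signals: the noise amplitude is one tenth of the speech amplitude,
# so the power ratio is 100.
speech = [1.0, -1.0, 1.0, -1.0]
noise = [0.1, -0.1, 0.1, -0.1]
print(round(snr_db(speech, noise), 6))
print(rt60_from_decay_rate(120.0))
```

A blind estimator has no access to `speech` and `noise` as separate inputs, which is why the paper frames the task as a learning problem.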

Topics

Deep Learning > Architectures > Neural Networks
Speech & Audio > Analysis > Speaker Verification
Speech & Audio > Analysis > Speech Analysis
Machine Learning > Learning Types > Classification

Keywords

speaker verification, signal-to-noise ratio, voice activity detection, neural network, reverberation time, signal quality estimation, speech signal quality
