INTERSPEECH 2024

Enhancing Non-Matching Reference Speech Quality Assessment through Dynamic Weight Adaptation

Abstract

Non-Matching Reference (NMR) is a promising approach in Speech Quality Assessment (SQA), allowing reference signals to be used without requiring the exact pristine version of each audio signal. However, NMR-SQA often relies on manually fixed weights for its multitask learning components. Setting these weights requires significant expert knowledge and rigidly fixes the role each task plays in supporting model training. Some tasks may be more beneficial at specific stages of training and should receive greater attention at those stages; fixed weights cannot accommodate such variation. To address this limitation, we propose a weight adaptation scheme for NMR-SQA that uses a novel probability distribution and a success history memory, allowing the weights to vary dynamically during training and providing multiple points to roll back to. Experiments on the NISQA test sets demonstrate the efficacy of our approach compared to other advanced methods.
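
To make the idea of dynamically adapted multitask weights concrete, the following is a minimal illustrative sketch, not the method of the paper. The abstract does not specify the probability distribution or the memory update rule, so everything here is an assumption: the class name HistoryWeightSampler, the use of a Gaussian as a stand-in for the paper's novel distribution, the memory size, and the rule "store weights whenever validation loss improves" are all hypothetical.

```python
import random
from collections import deque


class HistoryWeightSampler:
    """Hypothetical sketch: propose task weights near previously successful ones.

    Assumptions (not from the paper): Gaussian perturbation, a fixed-size
    memory, and "success" defined as a new best validation loss.
    """

    def __init__(self, num_tasks, memory_size=5, scale=0.1, seed=0):
        self.rng = random.Random(seed)
        self.scale = scale
        # Start with uniform weights so there is always a centre to sample around.
        self.memory = deque([[1.0 / num_tasks] * num_tasks], maxlen=memory_size)
        self.best_loss = float("inf")

    def propose(self):
        # Pick one stored weight vector (a possible roll-back point) as the centre.
        centre = self.rng.choice(list(self.memory))
        # Perturb each weight, clamp to stay positive, then renormalise to sum to 1.
        raw = [max(1e-6, self.rng.gauss(w, self.scale)) for w in centre]
        total = sum(raw)
        return [w / total for w in raw]

    def report(self, weights, val_loss):
        # Record the weights only if they led to a new best validation loss.
        if val_loss < self.best_loss:
            self.best_loss = val_loss
            self.memory.append(list(weights))
            return True
        return False


def combined_loss(task_losses, weights):
    # Weighted sum of per-task losses, as in standard multitask training.
    return sum(w * l for w, l in zip(weights, task_losses))


if __name__ == "__main__":
    sampler = HistoryWeightSampler(num_tasks=3, seed=1)
    weights = sampler.propose()
    # Placeholder per-task losses; in practice these come from the model.
    loss = combined_loss([1.0, 2.0, 3.0], weights)
    sampler.report(weights, val_loss=loss)
    print(weights, loss)
```

In this sketch, each entry in the memory acts as a point training can return to: a proposal that fails to improve validation loss is simply not stored, and later proposals are drawn around earlier successful weights instead.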
