Specialized Speech Enhancement Model Selection Based on Learned Non-Intrusive Quality Assessment Metric

Ryandhimas E. Zezario; Szu-Wei Fu; Xugang Lu; Hsin-Min Wang; Yu Tsao

2019 INTERSPEECH INTERSPEECH 2019

Specialized Speech Enhancement Model Selection Based on Learned Non-Intrusive Quality Assessment Metric

Abstract

Previous studies have shown that a specialized speech enhancement model can outperform a general model when the test condition is matched to the training condition. Therefore, choosing the correct (matched) candidate model from a set of ensemble models is critical to achieve generalizability. Although the best decision criterion should be based directly on the evaluation metric, the need for a clean reference makes it impractical for employment. In this paper, we propose a novel specialized speech enhancement model selection (SSEMS) approach that applies a non-intrusive quality estimation model, termed Quality-Net, to solve this problem. Experimental results first confirm the effectiveness of the proposed SSEMS approach. Moreover, we observe that the correctness of Quality-Net in choosing the most suitable model increases as input noisy SNR increases, and thus the results of the proposed systems outperform another auto-encoder-based model selection and a general model, particularly under high SNR conditions.

🧭 Keyword Pioneer — non-intrusive metric

🐣 Hot Topic Early Bird — model selection

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Ryandhimas E. Zezario , Szu-Wei Fu , Xugang Lu , Hsin-Min Wang , Yu Tsao

Topics

Machine Learning > Core Methods > Classification Machine Learning > Core Methods > Representation Learning Machine Learning > Application Areas > Efficient Computing

Keywords

model selection speech enhancement signal processing ensemble model quality assessment non-intrusive metric neural network

Download PDF

Related papers

Using Real-Time Visual Biofeedback for Second Language Instruction 2019

VAE-Based Regularization for Deep Speaker Embedding 2019

End-to-End SpeakerBeam for Single Channel Target Speech Recognition 2019

Attention-Enhanced Connectionist Temporal Classification for Discrete Speech Emotion Recognition 2019

Attentive to Individual: A Multimodal Emotion Recognition Network with Personalized Attention Profile 2019