2020 INTERSPEECH INTERSPEECH 2020

INTERSPEECH 2020 Deep Noise Suppression Challenge: A Fully Convolutional Recurrent Network (FCRN) for Joint Dereverberation and Denoising

Abstract

The Interspeech 2020 Deep Noise Suppression (DNS) Challenge focuses on evaluating low-latency single-channel speech enhancement algorithms under realistic test conditions. Our contribution to the challenge is a method for joint dereverberation and denoising based on complex spectral mask estimation using a fully convolutional recurrent network (FCRN) which relies on a convolutional LSTM layer for temporal modeling. Since the effects of reverberation and noise on perceived speech quality can differ notably, a multi-target loss for controlling the weight on desired dereverberation and denoising is proposed. In the crowdsourced subjective P.808 listening test conducted by the DNS Challenge organizers, the proposed method shows a significant overall improvement of 0.43 MOS points over the DNS Challenge baseline and ranks amongst the top-3 submissions for both realtime and non-realtime tracks of the challenge.

🌉 Interdisciplinary Bridge — Computer Vision and Machine Learning
🧭 Keyword Pioneer — complex spectral mask
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio