INTERSPEECH 2021

Residual Echo and Noise Cancellation with Feature Attention Module and Multi-Domain Loss Function

Abstract

In real-time acoustic echo cancellation under noisy conditions, classical linear adaptive filters (LAFs) can remove only the linear components of the acoustic echo. To further attenuate the non-linear echo components and the background noise, this paper proposes a deep learning-based residual echo and noise cancellation (RENC) model in which multiple inputs are weighted by a feature attention module. More specifically, input features extracted from the far-end reference and from the echo estimated by the LAF are scaled with time-frequency attention weights according to their correlation with the residual interference in the LAF output. In addition, a scale-independent mean square error loss and a perceptual loss are proposed for training the RENC model. Experimental results validate the efficacy of the proposed feature attention module and multi-domain loss function, which yield improvements of 8.4%, 14.9%, and 29.5% in perceptual evaluation of speech quality (PESQ), scale-invariant signal-to-distortion ratio (SI-SDR), and echo return loss enhancement (ERLE), respectively.
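For readers unfamiliar with two of the reported metrics, the following is a minimal sketch of SI-SDR and ERLE using their commonly cited textbook definitions. It is not the authors' evaluation code: the function names `si_sdr` and `erle`, the `eps` stabilizer, and the whole-signal (rather than frame-wise) computation are assumptions made for illustration.

```python
import math


def si_sdr(estimate, target, eps=1e-12):
    """Scale-invariant signal-to-distortion ratio in dB (assumed standard form).

    The estimate is projected onto the target, so rescaling the
    estimate leaves the score unchanged.
    """
    dot = sum(e * t for e, t in zip(estimate, target))
    target_energy = sum(t * t for t in target)
    alpha = dot / (target_energy + eps)           # optimal scaling of the target
    projection = [alpha * t for t in target]      # target component of the estimate
    residual = [e - p for e, p in zip(estimate, projection)]
    proj_energy = sum(p * p for p in projection)
    resid_energy = sum(r * r for r in residual)
    return 10.0 * math.log10((proj_energy + eps) / (resid_energy + eps))


def erle(mic, output, eps=1e-12):
    """Echo return loss enhancement in dB (assumed standard form).

    Ratio of microphone energy to processed-output energy, meaningful
    when only far-end echo is present in the microphone signal.
    """
    mic_energy = sum(x * x for x in mic)
    out_energy = sum(x * x for x in output)
    return 10.0 * math.log10((mic_energy + eps) / (out_energy + eps))


# Hypothetical toy signals, used only to exercise the functions.
clean = [math.sin(0.1 * n) for n in range(200)]
noisy = [c + 0.1 * math.cos(0.37 * n) for n, c in enumerate(clean)]
print(si_sdr(noisy, clean))
print(erle([1.0] * 100, [0.1] * 100))
```

Because the estimate is projected onto the target before the ratio is taken, multiplying `noisy` by any non-zero constant leaves `si_sdr` essentially unchanged, which is the property the "scale-invariant" name refers to.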
