INTERSPEECH 2020

Noisy-Reverberant Speech Enhancement Using DenseUNet with Time-Frequency Attention

Abstract

Background noise and room reverberation are two major distortions of the speech signal in real-world environments. Each degrades speech intelligibility and quality, and their combined effect is especially detrimental. In this paper, we propose a DenseUNet-based model for noisy-reverberant speech enhancement. A novel time-frequency (T-F) attention mechanism aggregates contextual information among different T-F units efficiently, and a channel-wise attention mechanism merges sources of information among different feature maps. In addition, we introduce a normalization-activation strategy to alleviate the performance drop in small-batch training. Systematic evaluations demonstrate that the proposed algorithm substantially improves objective speech intelligibility and quality in various noisy-reverberant conditions and outperforms other related methods.
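As a rough illustration of what aggregating context among T-F units can look like, the sketch below applies plain scaled dot-product self-attention to a flattened grid of T-F units. It is not the architecture proposed in the paper: the function names `softmax` and `tf_self_attention` are hypothetical, the queries, keys, and values are the unprojected inputs (an assumption made for brevity, where a trained model would use learned projections), and no DenseUNet layers are involved.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def tf_self_attention(units):
    """Scaled dot-product self-attention over T-F units (illustrative only).

    `units` is a list of feature vectors, one per T-F unit of a flattened
    time-frequency grid. Assumption: queries, keys, and values are the
    inputs themselves; this is not the paper's attention module.
    """
    dim = len(units[0])
    scale = math.sqrt(dim)
    attended = []
    for query in units:
        # Similarity of this T-F unit to every T-F unit, including itself.
        scores = [
            sum(q * k for q, k in zip(query, key)) / scale
            for key in units
        ]
        weights = softmax(scores)
        # Each output is a weighted average of all T-F unit features.
        attended.append([
            sum(w * value[d] for w, value in zip(weights, units))
            for d in range(dim)
        ])
    return attended


# A toy 2x2 T-F grid flattened to four units, each with two feature channels.
grid = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
context = tf_self_attention(grid)
```

Because the attention weights for each unit are non-negative and sum to one, every output vector is a convex combination of the input units, so information from the whole T-F grid is mixed into each position.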
