2016 INTERSPEECH INTERSPEECH 2016

Causal Speech Enhancement Combining Data-Driven Learning and Suppression Rule Estimation

Abstract

The problem of single-channel speech enhancement has been traditionally addressed by using statistical signal processing algorithms that are designed to suppress time-frequency regions affected by noise. We study an alternative data-driven approach which uses deep neural networks (DNNs) to learn the transformation from noisy and reverberant speech to clean speech, with a focus on real-time applications which require low-latency causal processing. We examine several structures in which deep learning can be used within an enhancement system. These include end-to-end DNN regression from noisy to clean spectra, as well as less intervening approaches which estimate a suppression gain for each time-frequency bin instead of directly recovering the clean spectral features. We also propose a novel architecture in which the general structure of a conventional noise suppressor is preserved, but the sub-tasks are independently learned and carried out by separate networks. It is shown that DNN-based suppression gain estimation outperforms the regression approach in the causal processing mode and for noise types that are not seen during DNN training.

πŸš€ Conference Pioneer β€” INTERSPEECH 2016
πŸŒ‰ Interdisciplinary Bridge β€” Artificial Intelligence and Deep Learning and Machine Learning
🧭 Keyword Pioneer β€” causal processing
🐝 Cross-Pollinator β€” Artificial Intelligence, Computer Science, Computer Vision, Deep Learning, Healthcare & Medicine, Interdisciplinary, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Speech & Audio
🐣 Hot Topic Early Bird β€” signal processing