2020 INTERSPEECH INTERSPEECH 2020

Frame-Level Signal-to-Noise Ratio Estimation Using Deep Learning

Abstract

This study investigates deep learning based signal-to-noise ratio (SNR) estimation at the frame level. We propose to employ recurrent neural networks (RNNs) with long short-term memory (LSTM) in order to leverage contextual information for this task. As acoustic features are important for deep learning algorithms, we also examine a variety of monaural features and investigate feature combinations using Group Lasso and sequential floating forward selection. By replacing LSTM with bidirectional LSTM, the proposed algorithm naturally leads to a long-term SNR estimator. Systematical evaluations demonstrate that the proposed SNR estimators significantly outperform other frame-level and long-term SNR estimators.

🌉 Interdisciplinary Bridge — Deep Learning and Machine Learning and Speech & Audio
🧭 Keyword Pioneer — signal-to-noise ratio estimation
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio