INTERSPEECH 2016

Speech Bandwidth Extension Using Bottleneck Features and Deep Recurrent Neural Networks

Abstract

This paper presents a novel method for speech bandwidth extension (BWE) using deep structured neural networks. To exploit linguistic information when predicting high-frequency spectral components, bottleneck (BN) features derived from a deep neural network (DNN)-based state classifier for narrowband speech are used as auxiliary input. Furthermore, recurrent neural networks (RNNs) with long short-term memory (LSTM) cells are adopted to model the complex mapping between the feature sequences describing the low-frequency and high-frequency spectra. Experimental results show that the proposed BWE method outperforms both the conventional method based on Gaussian mixture models (GMMs) and the state-of-the-art DNN-based approach in objective and subjective tests.
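
The data flow described in the abstract can be illustrated with a minimal sketch: per-frame low-frequency spectral features are concatenated with BN features and passed through an LSTM layer that emits high-frequency spectral features frame by frame. Everything specific here is an assumption made for illustration, not the paper's configuration: the function names `init_lstm` and `predict_highband`, the feature dimensions, the single unidirectional layer, and the random untrained weights. The BN features, which in the paper come from a DNN state classifier, are stood in for by random values.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def init_lstm(input_dim, hidden_dim, output_dim, seed=0):
    """Random, untrained parameters for one LSTM layer plus a linear output layer."""
    rng = np.random.default_rng(seed)
    return {
        # Stacked weights for the input, forget, output and candidate gates.
        "W": rng.normal(0.0, 0.1, (4 * hidden_dim, input_dim + hidden_dim)),
        "b": np.zeros(4 * hidden_dim),
        "W_out": rng.normal(0.0, 0.1, (output_dim, hidden_dim)),
        "b_out": np.zeros(output_dim),
    }


def predict_highband(params, lowband, bottleneck):
    """Map (T, D_low) low-band and (T, D_bn) BN features to (T, D_high) features."""
    # BN features act as auxiliary input: concatenate them frame by frame.
    x = np.concatenate([lowband, bottleneck], axis=1)
    hidden_dim = params["b"].shape[0] // 4
    h = np.zeros(hidden_dim)  # hidden state
    c = np.zeros(hidden_dim)  # cell state
    outputs = []
    for x_t in x:
        z = params["W"] @ np.concatenate([x_t, h]) + params["b"]
        i, f, o, g = np.split(z, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        outputs.append(params["W_out"] @ h + params["b_out"])
    return np.stack(outputs)


# Hypothetical sizes: 50 frames, 24 low-band dims, 32 BN dims, 16 high-band dims.
T, D_LOW, D_BN, D_HIGH = 50, 24, 32, 16
rng = np.random.default_rng(1)
lowband = rng.normal(size=(T, D_LOW))
bottleneck = rng.normal(size=(T, D_BN))  # placeholder for classifier-derived features
params = init_lstm(D_LOW + D_BN, hidden_dim=64, output_dim=D_HIGH)
highband = predict_highband(params, lowband, bottleneck)
print(highband.shape)
```

Because the recurrent state carries information across frames, each predicted high-band frame depends on the preceding input sequence rather than on a single frame, which is the property that distinguishes this mapping from a frame-wise DNN or GMM.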
