INTERSPEECH 2017

Unfolded Deep Recurrent Convolutional Neural Network with Jump Ahead Connections for Acoustic Modeling

Abstract

Recurrent neural networks (RNNs) with jump ahead connections have been used in computer vision tasks, but they have not been well investigated for automatic speech recognition (ASR). Meanwhile, the unfolded RNN has been shown to be an effective model for acoustic modeling. This paper investigates how to design an unfolded deep RNN architecture in which the recurrent connections use a convolutional neural network (CNN) to model short-term dependencies between hidden states. In this study, our unfolded RNN architecture is a CNN that processes a sequence of input features sequentially. At each time step, the CNN takes as input a small block of the input features together with the hidden-layer output from the preceding block, and computes its own hidden-layer output. In addition, by exploiting one or more jump ahead connections between time steps, the network can learn long-term dependencies more effectively. We carried out experiments on the CHiME 3 task that show the effectiveness of the proposed approach.
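
The block-by-block computation described in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation. It assumes that each recurrent and jump ahead connection is a one-dimensional convolution over the hidden vector, that the contributions are summed before a tanh nonlinearity, and that a jump ahead connection of distance d feeds the hidden state of step t-d into step t. The names conv1d_same, unfolded_rcnn, k_in, k_rec, k_jump and jumps are hypothetical and chosen only for illustration.

```python
import math


def conv1d_same(signal, kernel):
    """1-D cross-correlation with zero padding; output has the input's length.

    Assumes an odd-length kernel so that it can be centered on each position.
    """
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + j - half
            if 0 <= idx < len(signal):  # positions outside the signal count as zero
                acc += w * signal[idx]
        out.append(acc)
    return out


def unfolded_rcnn(blocks, k_in, k_rec, k_jump, jumps=(2,)):
    """Run the unfolded network over a sequence of equal-length feature blocks.

    blocks : list of feature blocks, one per time step
    k_in   : kernel applied to the current input block
    k_rec  : kernel applied to the previous hidden state (short-term dependence)
    k_jump : kernel applied to hidden states reached by jump ahead connections
    jumps  : jump distances in time steps (an assumption of this sketch)
    """
    hidden = []
    for t, block in enumerate(blocks):
        pre = conv1d_same(block, k_in)
        if t >= 1:
            # Convolutional recurrent connection from the preceding block.
            rec = conv1d_same(hidden[t - 1], k_rec)
            pre = [a + b for a, b in zip(pre, rec)]
        for d in jumps:
            if t >= d:
                # Jump ahead connection: step t-d feeds step t directly.
                jmp = conv1d_same(hidden[t - d], k_jump)
                pre = [a + b for a, b in zip(pre, jmp)]
        hidden.append([math.tanh(v) for v in pre])
    return hidden


# Toy usage: four time steps, each with a block of five feature values.
blocks = [[0.1 * (t + i) for i in range(5)] for t in range(4)]
states = unfolded_rcnn(blocks, [0.5, 1.0, 0.5], [0.2, 0.3, 0.2], [0.1, 0.4, 0.1])
```

Passing jumps=() removes the jump ahead connections and leaves only the step-to-step convolutional recurrence, and passing several distances, such as jumps=(2, 4), corresponds to the multiple-connection case mentioned in the abstract. A real acoustic model would use learned multi-channel kernels and an output layer, both omitted here.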
