Stochastic Recurrent Neural Network for Speech Recognition

Jen-Tzung Chien; Chen Shen

2017 INTERSPEECH INTERSPEECH 2017

Stochastic Recurrent Neural Network for Speech Recognition

Abstract

This paper presents a new stochastic learning approach to construct a latent variable model for recurrent neural network (RNN) based speech recognition. A hybrid generative and discriminative stochastic network is implemented to build a deep classification model. In the implementation, we conduct stochastic modeling for hidden states of recurrent neural network based on the variational auto-encoder. The randomness of hidden neurons is represented by the Gaussian distribution with mean and variance parameters driven by neural weights and learned from variational inference. Importantly, the class labels of input speech frames are incorporated to regularize this deep model to sample the informative and discriminative features for reconstruction of classification outputs. We accordingly propose the stochastic RNN (SRNN) to reflect the probabilistic property in RNN classification system. A stochastic error backpropagation algorithm is implemented. The experiments on speech recognition using TIMIT and Aurora4 show the merit of the proposed SRNN.

🌉 Interdisciplinary Bridge — Deep Learning and Machine Learning

🧭 Keyword Pioneer — stochastic modeling

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Speech & Audio

🐣 Hot Topic Early Bird — hidden state

Authors

Jen-Tzung Chien , Chen Shen

Topics

Machine Learning > Optimization & Theory > Stochastic Processes Deep Learning > Architectures > Neural Networks Deep Learning > Models > Variational Inference Speech & Audio > Recognition > Speech Recognition Machine Learning > Bayesian & Probabilistic > Variational Inference

Keywords

variational inference speech recognition hidden state recurrent neural network stochastic modeling variational auto-encoder stochastic neural network

Download PDF

Related papers

Description of the Munich-Passau Snore Sound Corpus (MPSSC) 2017

A Study on Replay Attack and Anti-Spoofing for Automatic Speaker Verification 2017

Binaural Reverberant Speech Separation Based on Deep Neural Networks 2017

Building Audio-Visual Phonetically Annotated Arabic Corpus for Expressive Text to Speech 2017

A Comparison of Danish Listeners’ Processing Cost in Judging the Truth Value of Norwegian, Swedish, and English Sentences 2017