Comparative Study of Parametric and Representation Uncertainty Modeling for Recurrent Neural Network Language Models

Jianwei Yu; Max W.Y. Lam; Shoukang Hu; Xixin Wu; Xu Li; Yuewen Cao; Xunying Liu; Helen Meng

INTERSPEECH 2019

Abstract

Recurrent neural network language models (RNNLMs) have shown superior performance across a range of tasks, including speech recognition. The hidden layer of an RNNLM plays a vital role in learning suitable representations of context for word prediction. However, the deterministic model parameters and fixed hidden vectors of conventional RNNLMs have limited power to model the uncertainty over hidden representations. To address this issue, this paper presents a comparative study of parametric and hidden representation uncertainty modeling approaches, based on Bayesian gates and variational RNNLMs respectively, for long short-term memory (LSTM) and gated recurrent unit (GRU) LMs. Experimental results are presented on two tasks: the Penn Treebank (PTB) corpus and Switchboard (SWBD) conversational telephone speech. Consistent performance improvements were obtained over conventional RNNLMs in terms of both perplexity and word error rate.
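As a rough illustration of what parametric uncertainty means for a gate, the sketch below contrasts a conventional sigmoid gate with fixed weights against a gate whose weights are drawn from independent Gaussians. It is a toy example under stated assumptions, not the formulation used in the paper: the function names, the diagonal Gaussian over weights, the fixed bias, and the Monte Carlo averaging are all illustrative choices.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def deterministic_gate(x, w, b):
    """Conventional gate: fixed weights give one fixed activation."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def sample_bayesian_gate(x, mu, sigma, b, rng):
    """One draw of a gate whose weights follow N(mu_i, sigma_i^2).

    Assumption for illustration: weights are independent Gaussians and are
    sampled with the reparameterization w_i = mu_i + sigma_i * eps_i,
    where eps_i ~ N(0, 1).
    """
    w = [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]
    return deterministic_gate(x, w, b)


def expected_bayesian_gate(x, mu, sigma, b, n_samples=1000, seed=0):
    """Monte Carlo estimate of the mean gate activation over weight draws."""
    rng = random.Random(seed)
    draws = [sample_bayesian_gate(x, mu, sigma, b, rng) for _ in range(n_samples)]
    return sum(draws) / n_samples
```

With every standard deviation set to zero, the sampled gate reduces to the deterministic one; with non-zero standard deviations, repeated draws give different activations for the same input, which is the uncertainty a fixed-weight gate cannot express.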

Topics

Speech & Audio > Recognition > Speech Recognition
Artificial Intelligence > Bayesian & Probabilistic > Bayesian Inference

Keywords

variational inference; Bayesian inference; speech recognition; recurrent neural network language model

Related papers

Using Real-Time Visual Biofeedback for Second Language Instruction 2019

VAE-Based Regularization for Deep Speaker Embedding 2019

End-to-End SpeakerBeam for Single Channel Target Speech Recognition 2019

Attention-Enhanced Connectionist Temporal Classification for Discrete Speech Emotion Recognition 2019

Attentive to Individual: A Multimodal Emotion Recognition Network with Personalized Attention Profile 2019