2017 INTERSPEECH INTERSPEECH 2017

An Efficient Phone N-Gram Forward-Backward Computation Using Dense Matrix Multiplication

Abstract

The forward-backward algorithm is commonly used to train neural network acoustic models when optimizing a sequence objective like MMI and sMBR. Recent work on lattice-free MMI training of neural network acoustic models shows that the forward-backward algorithm can be computed efficiently in the probability domain as a series of sparse matrix multiplications using GPUs. In this paper, we present a more efficient way of computing forward-backward using a dense matrix multiplication approach. We do this by exploiting the block-diagonal structure of the n-gram state transition matrix; instead of multiplying large sparse matrices, the proposed method involves a series of smaller dense matrix multiplications, which can be computed in parallel. Efficient implementation can be easily achieved by leveraging on the optimized matrix multiplication routines provided by standard libraries, such as NumPy and TensorFlow. Runtime benchmarks show that the dense multiplication method is consistently faster than the sparse multiplication method (on both CPUs and GPUs), when applied to a 4-gram phone language model. This is still the case even when the sparse multiplication method uses a more compact finite state model representation by excluding unseen n-grams.

🌉 Interdisciplinary Bridge — Machine Learning and Mathematics & Optimization
🧭 Keyword Pioneer — dense matrix multiplication
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Deep Learning, Healthcare & Medicine, Interdisciplinary, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Speech & Audio