Memory-Efficient Modeling and Search Techniques for Hardware ASR Decoders

Michael Price; Anantha Chandrakasan; James Glass

INTERSPEECH 2016

Abstract

This paper gives an overview of acoustic modeling and search techniques for low-power embedded ASR decoders. Our design decisions prioritize memory bandwidth, which is the main driver in system power consumption. We evaluate three acoustic modeling approaches — Gaussian mixture model (GMM), subspace GMM (SGMM) and deep neural network (DNN) — and identify tradeoffs between memory bandwidth and recognition accuracy. We also present an HMM search scheme with WFST compression and caching, predictive beam width control, and a word lattice. Our results apply to embedded system implementations using microcontrollers, DSPs, FPGAs, or ASICs.
