Iterative Compression of End-to-End ASR Model Using AutoML

Abhinav Mehrotra; Łukasz Dudziak; Jinsu Yeo; Young-Yoon Lee; Ravichander Vipperla; Mohamed S. Abdelfattah; Sourav Bhattacharya; Samin Ishtiaq; Alberto Gil C.P. Ramos; SangJeong Lee; Daehyun Kim; Nicholas D. Lane

INTERSPEECH 2020

Abstract

Increasing demand for on-device Automatic Speech Recognition (ASR) systems has renewed interest in automatic model compression techniques. Past research has shown that an AutoML-based Low Rank Factorization (LRF) technique, when applied to an end-to-end Encoder-Attention-Decoder style ASR model, can achieve a speedup of up to 3.7×, outperforming laborious manual rank-selection approaches. However, we show that current AutoML-based search techniques work only up to a certain compression level, beyond which they fail to produce compressed models with acceptable word error rates (WER). In this work, we propose an iterative AutoML-based LRF approach that achieves over 5× compression without degrading the WER, thereby advancing the state of the art in ASR compression.
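To make the rank-selection problem concrete, here is a minimal sketch of low rank factorization applied to a single weight matrix. It assumes truncated SVD as the factorization method, which is a common choice but is not confirmed by the abstract; the function names, the 512 × 1024 matrix, and the fixed rank of 64 are hypothetical. In the paper's setting the per-layer ranks are chosen by an AutoML search, which this sketch does not attempt to reproduce.

```python
import numpy as np


def low_rank_factorize(W, rank):
    """Approximate W (m x n) by A (m x rank) @ B (rank x n) using truncated SVD.

    Assumption: truncated SVD stands in for whatever factorization the
    paper uses; the abstract does not specify it.
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]  # fold the kept singular values into the left factor
    B = Vt[:rank, :]
    return A, B


def compression_ratio(m, n, rank):
    """Parameter count of the dense layer divided by that of the two factors."""
    return (m * n) / (rank * (m + n))


# Hypothetical layer: a random 512 x 1024 weight matrix factorized at rank 64.
rng = np.random.default_rng(0)
W = rng.standard_normal((512, 1024))
A, B = low_rank_factorize(W, rank=64)

ratio = compression_ratio(512, 1024, 64)
rel_err = np.linalg.norm(W - A @ B) / np.linalg.norm(W)
print(f"compression: {ratio:.2f}x, relative reconstruction error: {rel_err:.3f}")
```

Lowering the rank raises the compression ratio but also the reconstruction error, which is the trade-off a rank search has to balance against WER. A random matrix like the one above has little low-rank structure, so its error is much higher than one would expect from trained weights.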

Topics

Artificial Intelligence > Core AI > Model Compression

Speech & Audio > Recognition > Automatic Speech Recognition

Speech & Audio > Recognition > Speech Recognition

Deep Learning > Optimization & Theory > Model Compression

Keywords

model compression; automatic speech recognition; automated machine learning; end-to-end model; word error rate; end-to-end speech recognition; low rank factorization
