INTERSPEECH 2022

Low-complex and Highly-performed Binary Residual Neural Network for Small-footprint Keyword Spotting

Abstract

A power-aware hardware implementation of Keyword Spotting (KWS) requires a small memory footprint, low computational complexity, and high accuracy. In this article, three contributions are introduced to satisfy these three stringent requirements. First, a lightweight Binary Residual Neural Network (B-ResNet) is proposed and applied to small-footprint KWS. The parameters and computations inside the network are greatly reduced by binary quantization. Second, during forward propagation, the distribution of the binary activations is optimized by our proposed learnable activation function with fixed-value shift initialization. Third, a variable periodic window (PW) for backward gradient correction (BGC) is put forward to avoid gradient mismatch and vanishing-gradient problems during back-propagation. The latter two improvements effectively increase accuracy under binarization. These results are promising for future hardware KWS implementations.
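The forward and backward treatment of a binary activation described above can be illustrated with a minimal sketch. The abstract does not give the exact form of the learnable activation function or of the variable periodic window, so this example substitutes a generic stand-in: sign binarization with a scalar shift, and a fixed-width clipped straight-through estimator for the backward pass. The names `binarize_forward`, `binarize_backward`, `shift`, and `window` are hypothetical and not taken from the paper.

```python
import numpy as np

def binarize_forward(x, shift):
    """Sign binarization of (x - shift); every output is +1 or -1.

    `shift` stands in for a learnable activation offset (assumption).
    """
    return np.where(x - shift >= 0.0, 1.0, -1.0)

def binarize_backward(x, shift, grad_out, window=1.0):
    """Clipped straight-through estimator (a stand-in, not the paper's PW).

    Gradients pass unchanged where |x - shift| <= window and are
    zeroed elsewhere, which limits gradient mismatch far from the threshold.
    """
    mask = (np.abs(x - shift) <= window).astype(x.dtype)
    grad_x = grad_out * mask
    # d(x - shift)/d(shift) = -1, so the shift gradient is the negated sum.
    grad_shift = -np.sum(grad_x)
    return grad_x, grad_shift

# Toy pre-activations and an illustrative shift value.
x = np.array([-2.0, -0.3, 0.1, 0.4, 1.5])
shift = 0.2

y = binarize_forward(x, shift)
grad_x, grad_shift = binarize_backward(x, shift, np.ones_like(x))

print("binary activations:", y)
print("gradient w.r.t. x:", grad_x)
print("gradient w.r.t. shift:", grad_shift)
```

In this toy setup, moving `shift` changes how many activations land on each side of the threshold, which is one way to see how a shift can rebalance the distribution of binary outputs. A variable window would replace the fixed `window` argument with a schedule; its actual form is not specified in the abstract.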
