2019
INTERSPEECH
INTERSPEECH 2019
A Time Delay Neural Network with Shared Weight Self-Attention for Small-Footprint Keyword Spotting
Abstract
Keyword spotting requires a small memory footprint to run on mobile devices. However, previous works still use several hundred thousand parameters to achieve good performance. To address this issue, we propose a time delay neural network with shared weight self-attention for small-footprint keyword spotting. By sharing weights, the parameters of self-attention are reduced but without performance reduction. The publicly available Google Speech Commands dataset is used to evaluate the models. The number of parameters (12K) of our model is 1/20 of state-of-the-art ResNet model (239K). The proposed model achieves an error rate of 4.19% , which is comparable to the ResNet model.
🐣
Hot Topic Early Bird
— parameter efficiency
🐝
Cross-Pollinator
— Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio