Training Keyword Spotting Models on Non-IID Data with Federated Learning

Andrew Hard; Kurt Partridge; Cameron Nguyen; Niranjan Subrahmanya; Aishanee Shah; Pai Zhu; Ignacio Lopez Moreno; Rajiv Mathews

INTERSPEECH 2020

Abstract

We demonstrate that a production-quality keyword-spotting model can be trained on-device using federated learning and achieve comparable false accept and false reject rates to a centrally-trained model. To overcome the algorithmic constraints associated with fitting on-device data (which are inherently non-independent and identically distributed), we conduct thorough empirical studies of optimization algorithms and hyperparameter configurations using large-scale federated simulations. To overcome resource constraints, we replace memory-intensive MTR data augmentation with SpecAugment, which reduces the false reject rate by 56%. Finally, to label examples (given the zero visibility into on-device data), we explore teacher-student training.
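The SpecAugment-style masking mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `spec_augment`, its parameters, and the mask sizes are assumptions chosen for illustration, and the time-warping step of the original SpecAugment recipe is omitted. The sketch zeroes one random band of frequency bins and one random band of time frames, operating only on features that are already in memory.

```python
import random


def spec_augment(spectrogram, max_freq_mask=2, max_time_mask=2, rng=None):
    """Return a masked copy of `spectrogram`.

    `spectrogram` is a list of time frames, each a list of frequency-bin
    values. All names and default mask sizes here are hypothetical.
    """
    rng = rng or random.Random()
    num_frames = len(spectrogram)
    num_bins = len(spectrogram[0])
    # Copy so the caller's features are left untouched.
    out = [list(frame) for frame in spectrogram]

    # Frequency mask: zero `f` consecutive bins in every frame.
    f = rng.randint(0, min(max_freq_mask, num_bins))
    f0 = rng.randint(0, num_bins - f)
    for frame in out:
        for k in range(f0, f0 + f):
            frame[k] = 0.0

    # Time mask: zero `t` consecutive frames.
    t = rng.randint(0, min(max_time_mask, num_frames))
    t0 = rng.randint(0, num_frames - t)
    for i in range(t0, t0 + t):
        out[i] = [0.0] * num_bins

    return out


# Usage: 6 frames x 4 bins of dummy features, seeded for repeatability.
features = [[1.0] * 4 for _ in range(6)]
augmented = spec_augment(features, rng=random.Random(0))
```

Because the masks are drawn fresh on every call, each training step can see a differently masked version of the same utterance.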

Topics

Artificial Intelligence > Learning Paradigms > Federated Learning
Artificial Intelligence > Learning Paradigms > Transfer Learning
Machine Learning > Application Areas > Privacy
Machine Learning > Optimization & Theory > Stochastic Methods

Keywords

federated learning, knowledge distillation, data augmentation, keyword spotting, non-IID data, teacher-student learning
