INTERSPEECH 2022

Online Learning of Open-set Speaker Identification by Active User-registration

Abstract

Registering each user's identity for a voice assistant is burdensome in multi-user environments such as a household, particularly when registration must happen on the fly with minimal effort. Most prior work on speaker identification (SID) does not support online updates and therefore cannot seamlessly add new speakers. To address this limitation, we introduce a novel online learning approach to open-set SID that actively registers unknown users in the household setting. Based on MPART (Message Passing Adaptive Resonance Theory), our method performs online active semi-supervised learning for open-set SID, using speaker embedding vectors to infer new speakers and request the user's identity. Our method progressively improves overall SID performance without forgetting, making it attractive for many interactive real-world applications. We evaluate our model in the online learning setting of an open-set SID task in which new speakers are added on the fly, demonstrating its superior performance.
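The interaction loop described above (score an incoming speaker embedding against known speakers, ask the user only when the speaker looks new, and update the model online) can be illustrated with a minimal sketch. This is not MPART: it swaps in a simple nearest-centroid classifier with a cosine-similarity threshold purely for illustration. The class name `OnlineOpenSetSID`, the threshold value 0.7, the `ask_user` callback, and the two-dimensional toy embeddings are all assumptions and do not come from the paper.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)


class OnlineOpenSetSID:
    """Toy open-set speaker identifier with active registration.

    Hypothetical stand-in for the paper's method: nearest centroid
    plus a similarity threshold, not MPART itself.
    """

    def __init__(self, threshold=0.7):
        self.threshold = threshold  # assumed similarity cut-off
        self.centroids = {}         # speaker name -> running mean embedding
        self.counts = {}            # speaker name -> samples seen so far

    def predict(self, emb):
        """Return (name, similarity); name is None if the speaker looks unknown."""
        best, best_sim = None, -1.0
        for name, centroid in self.centroids.items():
            sim = cosine(emb, centroid)
            if sim > best_sim:
                best, best_sim = name, sim
        if best_sim < self.threshold:
            return None, best_sim
        return best, best_sim

    def update(self, name, emb):
        """Incrementally update (or create) the running mean for a speaker."""
        n = self.counts.get(name, 0)
        old = self.centroids.get(name, [0.0] * len(emb))
        self.centroids[name] = [(o * n + e) / (n + 1) for o, e in zip(old, emb)]
        self.counts[name] = n + 1

    def observe(self, emb, ask_user):
        """Process one embedding; query the user only when no speaker matches."""
        name, _ = self.predict(emb)
        if name is None:
            name = ask_user(emb)  # active registration request
        self.update(name, emb)    # confident predictions are used as labels
        return name


# Usage: a stream of toy embeddings from two speakers.
queries = []


def ask(emb):
    """Hypothetical user prompt; here it answers from the toy geometry."""
    queries.append(emb)
    return "alice" if emb[0] > emb[1] else "bob"


sid = OnlineOpenSetSID(threshold=0.7)
stream = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels = [sid.observe(e, ask) for e in stream]
```

In this toy run the user is queried only for the first utterance of each speaker; the remaining utterances are identified and absorbed without supervision. A real system would use learned speaker embeddings and a more careful uncertainty criterion than a fixed threshold.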
