2021 INTERSPEECH INTERSPEECH 2021

Online Speaker Diarization Equipped with Discriminative Modeling and Guided Inference

Abstract

Despite considerable efforts, online speaker diarization remains an ongoing challenge. In this study, we propose to tackle the challenge from two perspectives, to endow diarization model with discriminability and to rectify less-reliable online inference with guidance. Specifically, based on the current prior art, UIS-RNN, two enhancement approaches are proposed to concretize our motivations. The effectiveness of our proposals is experimentally validated by results on the AMI evaluation set. With substantial relative improvement of 48.7%, our online speaker diarization system significantly outperformed its baseline. More impressively, its performance in terms of diarization error rate is better than most state-of-the-art offline systems.

🌉 Interdisciplinary Bridge — Deep Learning and Machine Learning
🧭 Keyword Pioneer — guided inference
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio