INTERSPEECH 2024

Hybrid-Diarization System with Overlap Post-Processing for the DISPLACE 2024 Challenge

Abstract

This paper describes our team's participation in Track 1 (Speaker Diarization) of the Diarization of Speaker and Language in Conversational Environments (DISPLACE) Challenge 2024. Our submission focuses on building a diarization system that is robust to noisy conditions and to high amounts of overlapped speech. We conduct an exhaustive study of each component of a hybrid system, using techniques such as semi-supervised learning and ensembling of several systems, and we experiment with both a neural overlap detection module and a post-processing technique that uses an external overlap detection system. Our final system achieves a diarization error rate (DER) of 28.04% on the Phase 1 Eval set, a relative improvement of 19.33% over the baseline DER of 34.76%.
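
The relative improvement quoted in the abstract can be checked directly from the two DER figures. The snippet below is a minimal sketch of that arithmetic only; the function name `relative_improvement` is a hypothetical helper introduced for illustration and is not part of the authors' system.

```python
def relative_improvement(baseline: float, system: float) -> float:
    """Relative reduction of an error metric, as a percentage of the baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline - system) / baseline


# DER values (%) on the Phase 1 Eval set, as reported in the abstract.
baseline_der = 34.76
system_der = 28.04

print(f"{relative_improvement(baseline_der, system_der):.2f}%")  # prints 19.33%
```

Note that the absolute reduction is 6.72 DER points; the 19.33% figure expresses that reduction relative to the baseline.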
