INTERSPEECH 2017

Speaker Clustering by Iteratively Finding Discriminative Feature Space and Cluster Labels

Abstract

This paper presents a speaker clustering framework that alternates between two stages: a discriminative feature space is estimated from the current set of cluster labels, and the cluster labels are then updated by running a clustering algorithm in that feature space. During the iterations, the cluster labels may differ from the true labels, so a feature space estimated from them may not discriminate the speakers well. However, repeating the two stages yields progressively more accurate cluster labels and a more discriminative feature space, until both converge. In this research, linear discriminant analysis (LDA) is used to obtain a discriminative i-vector feature space, and variational Bayesian expectation-maximization (VB-EM) on a Gaussian mixture model (GMM) is used to cluster the i-vectors. Our iterative clustering framework was evaluated on a database of keyword utterances and compared with recently published approaches. In all experiments, our framework outperformed the other approaches and converged within a few iterations.
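
The alternation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes NumPy, uses synthetic vectors in place of i-vectors, and substitutes plain k-means with a fixed number of clusters for the VB-EM GMM clustering stage. All function names (`lda_projection`, `kmeans`, `same_partition`, `iterative_clustering`) and parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def lda_projection(X, labels, n_dims):
    """Fisher LDA: directions maximizing between- over within-cluster scatter."""
    d = X.shape[1]
    mean = X.mean(axis=0)
    Sw = 1e-6 * np.eye(d)            # small ridge keeps Sw positive definite
    Sb = np.zeros((d, d))
    for c in np.unique(labels):
        Xc = X[labels == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        Sb += len(Xc) * np.outer(mc - mean, mc - mean)
    # Solve Sb w = lambda Sw w by whitening with the Cholesky factor of Sw.
    L_inv = np.linalg.inv(np.linalg.cholesky(Sw))
    evals, evecs = np.linalg.eigh(L_inv @ Sb @ L_inv.T)   # ascending order
    top = evecs[:, ::-1][:, :n_dims]                      # largest eigenvalues
    return L_inv.T @ top

def kmeans(X, k, n_iter=100):
    """Plain k-means (a stand-in for VB-EM GMM) with farthest-point init."""
    centers = [X[0]]
    for _ in range(1, k):
        d2 = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[d2.argmax()])
    centers = np.array(centers, dtype=float)
    labels = None
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        new_labels = d2.argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break
        labels = new_labels
        for j in range(k):
            if np.any(labels == j):          # leave empty clusters untouched
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def same_partition(a, b):
    """True if two label vectors group the points identically (any label IDs)."""
    return np.array_equal(a[:, None] == a[None, :], b[:, None] == b[None, :])

def iterative_clustering(X, k, n_dims, max_iter=20):
    """Alternate between LDA on current labels and re-clustering in LDA space."""
    labels = kmeans(X, k)                    # initial labels in the raw space
    n_done = 0
    for n_done in range(1, max_iter + 1):
        W = lda_projection(X, labels, n_dims)
        new_labels = kmeans(X @ W, k)
        converged = same_partition(new_labels, labels)
        labels = new_labels
        if converged:
            break
    return labels, n_done

# Synthetic demo: 3 "speakers", 30 vectors each, in 6 dimensions.
rng = np.random.default_rng(0)
true_speakers = np.repeat(np.arange(3), 30)
centers = np.zeros((3, 6))
centers[1, 0] = 50.0
centers[2, 1] = 50.0
X = centers[true_speakers] + rng.normal(size=(90, 6))

labels, n_iter = iterative_clustering(X, k=3, n_dims=2)
print("iterations:", n_iter, "cluster sizes:", np.bincount(labels))
```

Convergence is checked on the partition rather than on the raw label values, because a clustering step may return the same grouping under permuted label IDs. The synthetic clusters here are deliberately well separated, so the sketch only demonstrates the control flow and says nothing about performance on real i-vectors.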
