Speaker Clustering by Iteratively Finding Discriminative Feature Space and Cluster Labels

Sungrack Yun; Hye Jin Jang; Taesu Kim

INTERSPEECH 2017

Abstract

This paper presents a speaker clustering framework that iterates between two stages: a discriminative feature space is obtained given a set of cluster labels, and the cluster labels are updated by a clustering algorithm given that feature space. During the iterations, the cluster labels may differ from the true labels, so the feature space derived from them may not be accurately discriminative. However, by repeating the two stages, more accurate cluster labels and a more discriminative feature space are obtained, and the two eventually converge. In this work, linear discriminant analysis is used to make the i-vector feature space discriminative, and variational Bayesian expectation-maximization on a Gaussian mixture model is used to cluster the i-vectors. The framework was evaluated on a database of keyword utterances and compared with recently published approaches. In all experiments, the framework outperforms the other approaches and converges within a few iterations.
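The two-stage loop described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: it assumes the i-vectors are already extracted and passed in as a NumPy array `X`, it uses scikit-learn's `LinearDiscriminantAnalysis` and `BayesianGaussianMixture` as stand-ins for the LDA and variational Bayesian GMM stages, and the function name, the `max_components` and `max_iters` parameters, and the stopping rule (labels unchanged up to renaming) are all hypothetical choices.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import adjusted_rand_score
from sklearn.mixture import BayesianGaussianMixture


def iterative_speaker_clustering(X, max_components=10, max_iters=10, seed=0):
    """Alternate LDA projection and VB-GMM clustering until the labels stop changing.

    X is assumed to hold one i-vector per row (an assumption of this sketch).
    Returns the final cluster labels and the number of two-stage rounds run.
    """

    def cluster(features):
        # Variational Bayesian GMM; unused components receive small weights,
        # so max_components acts only as an upper bound on the cluster count.
        gmm = BayesianGaussianMixture(
            n_components=max_components, max_iter=500, random_state=seed
        )
        return gmm.fit_predict(features)

    # Initial labels come from clustering the original feature space.
    labels = cluster(X)
    rounds = 0
    for _ in range(max_iters):
        n_clusters = len(np.unique(labels))
        if n_clusters < 2:
            break  # LDA needs at least two classes
        rounds += 1
        # Stage 1: find a discriminative feature space given the current labels.
        lda = LinearDiscriminantAnalysis(
            n_components=min(n_clusters - 1, X.shape[1])
        )
        projected = lda.fit_transform(X, labels)
        # Stage 2: update the labels by clustering in the projected space.
        new_labels = cluster(projected)
        # Stop when the partition is unchanged (cluster IDs may be permuted).
        converged = adjusted_rand_score(labels, new_labels) >= 1.0 - 1e-12
        labels = new_labels
        if converged:
            break
    return labels, rounds
```

Calling `iterative_speaker_clustering(X)` on an `(n_utterances, dim)` array returns one cluster label per utterance together with the number of rounds performed. Comparing partitions with the adjusted Rand index avoids treating a mere relabeling of the same clusters as a change.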

Authors

Sungrack Yun; Hye Jin Jang; Taesu Kim

Topics

Machine Learning > Core Methods > Clustering
Machine Learning > Core Methods > Representation Learning
Machine Learning > Learning Types > Unsupervised Learning
Speech & Audio > Analysis > Speech Analysis
Machine Learning > Bayesian & Probabilistic > Variational Inference

Keywords

variational Bayesian; linear discriminant analysis; Gaussian mixture model; speaker clustering; discriminative feature space; variational Bayesian expectation-maximization
