Hidden Markov Model Variational Autoencoder for Acoustic Unit Discovery

Janek Ebbers; Jahn Heymann; Lukas Drude; Thomas Glarner; Reinhold Haeb-Umbach; Bhiksha Raj

INTERSPEECH 2017

Abstract

Variational Autoencoders (VAEs) have been shown to provide efficient neural-network-based approximate Bayesian inference for observation models for which exact inference is intractable. Their extension, the so-called Structured VAE (SVAE), allows inference in the presence of both discrete and continuous latent variables. Inspired by this extension, we developed a VAE with Hidden Markov Models (HMMs) as latent models. We applied the resulting HMM-VAE to the task of acoustic unit discovery in a zero-resource scenario. Starting from an initial model based on variational inference in an HMM with Gaussian Mixture Model (GMM) emission probabilities, the HMM-VAE significantly improved the accuracy of acoustic unit discovery. In doing so, we were able to demonstrate for an unsupervised learning task what is well known in the supervised learning case: neural networks provide superior modeling power compared to GMMs.
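
The abstract does not spell out the model equations, so the following is only a generic illustration of the HMM building block it mentions, not the authors' HMM-VAE. It is a minimal sketch of the forward algorithm in the log domain for an HMM with one-dimensional Gaussian emissions. The function names (`log_gauss`, `logsumexp`, `hmm_log_likelihood`) and all parameter values are hypothetical. In an HMM-VAE, the sequence scored by the HMM would presumably consist of latent codes produced by an encoder network rather than raw features; that reading is an assumption.

```python
import math


def log_gauss(x, mean, var):
    """Log density of a 1-D Gaussian N(x; mean, var)."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def hmm_log_likelihood(obs, log_pi, log_A, means, variances):
    """Forward algorithm in the log domain; returns log p(obs).

    log_pi[j]   : log initial probability of state j
    log_A[i][j] : log transition probability from state i to state j
    means[j], variances[j] : Gaussian emission parameters of state j
    """
    n = len(log_pi)
    # Initialisation: alpha_1(j) = pi_j * b_j(x_1)
    alpha = [log_pi[j] + log_gauss(obs[0], means[j], variances[j])
             for j in range(n)]
    # Recursion: alpha_t(j) = sum_i alpha_{t-1}(i) * A_ij * b_j(x_t)
    for x in obs[1:]:
        alpha = [
            logsumexp([alpha[i] + log_A[i][j] for i in range(n)])
            + log_gauss(x, means[j], variances[j])
            for j in range(n)
        ]
    # Termination: p(obs) = sum_j alpha_T(j)
    return logsumexp(alpha)


# Hypothetical two-state example (all values are illustrative only).
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.2, 0.8]]
log_pi = [math.log(p) for p in pi]
log_A = [[math.log(a) for a in row] for row in A]
sequence = [0.1, -0.3, 2.0]
score = hmm_log_likelihood(sequence, log_pi, log_A, [0.0, 2.0], [1.0, 0.5])
```

Working in the log domain avoids numerical underflow on long sequences, which matters for speech, where an utterance typically spans hundreds of frames.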

Topics

Machine Learning > Learning Types > Unsupervised Learning
Machine Learning > Optimization & Theory > Bayesian Inference
Deep Learning > Models > Variational Inference
Speech & Audio > Analysis > Speech Analysis
Machine Learning > Bayesian & Probabilistic > Variational Inference
Machine Learning > Learning Paradigms > Self-Supervised Learning

Keywords

unsupervised learning, Bayesian inference, hidden Markov model, Gaussian mixture model, variational autoencoder, acoustic unit discovery, zero-resource learning, structured VAE
