Large Margin Hidden Markov Models for Automatic Speech Recognition

Fei Sha; Lawrence K. Saul

NIPS (NeurIPS) 2006

Abstract

We study the problem of parameter estimation in continuous density hidden Markov models (CD-HMMs) for automatic speech recognition (ASR). As in support vector machines, we propose a learning algorithm based on the goal of margin maximization. Unlike earlier work on max-margin Markov networks, our approach is specifically geared to the modeling of real-valued observations (such as acoustic feature vectors) using Gaussian mixture models. Unlike previous discriminative frameworks for ASR, such as maximum mutual information and minimum classification error, our framework leads to a convex optimization, without any spurious local minima. The objective function for large margin training of CD-HMMs is defined over a parameter space of positive semidefinite matrices. Its optimization can be performed efficiently with simple gradient-based methods that scale well to large problems. We obtain competitive results for phonetic recognition on the TIMIT speech corpus.
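The abstract's two technical ingredients, a margin-based objective over positive semidefinite matrices and simple gradient-based optimization, can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not state: each class is modeled by a single Gaussian rather than a mixture, frames are classified independently with no HMM sequence structure, each Gaussian is folded into one matrix `Phi_c` acting on the augmented vector `z = [x, 1]` so that `z^T Phi_c z` plays the role of a distance, the loss is a hinge with unit margin, and positive semidefiniteness is enforced by eigenvalue clipping after each step. All function names (`project_psd`, `hinge_loss_and_grad`, `train`, `predict`) and the step size are hypothetical.

```python
import numpy as np

def project_psd(M):
    """Project a matrix onto the PSD cone by symmetrizing and clipping negative eigenvalues."""
    S = (M + M.T) / 2.0
    w, V = np.linalg.eigh(S)
    return (V * np.clip(w, 0.0, None)) @ V.T

def hinge_loss_and_grad(Phi, Z, y):
    """Hinge loss and subgradient for a large margin criterion (illustrative assumption).

    Phi: (C, d+1, d+1) one matrix per class.
    Z:   (N, d+1) augmented features [x, 1].
    y:   (N,) integer class labels.
    scores[n, c] = z_n^T Phi_c z_n is treated as a distance (lower is better).
    """
    n_idx = np.arange(len(y))
    scores = np.einsum('ni,cij,nj->nc', Z, Phi, Z)
    true = scores[n_idx, y][:, None]
    # Desired: every competing class is at least 1 farther than the true class.
    viol = 1.0 + true - scores
    viol[n_idx, y] = 0.0
    active = viol > 0
    loss = viol[active].sum()

    outer = np.einsum('ni,nj->nij', Z, Z)
    grad = np.zeros_like(Phi)
    for c in range(Phi.shape[0]):
        # Class c as a violating competitor: its distance should grow.
        grad[c] -= outer[active[:, c]].sum(axis=0)
        # Class c as the true label: its distance should shrink, once per violation.
        mask = (y == c)
        counts = active[mask].sum(axis=1)
        grad[c] += np.einsum('n,nij->ij', counts, outer[mask])
    return loss, grad

def train(X, y, num_classes, steps=200, lr=0.01):
    """Projected subgradient descent: gradient step, then projection onto the PSD cone."""
    N, d = X.shape
    Z = np.hstack([X, np.ones((N, 1))])
    Phi = np.stack([np.eye(d + 1) for _ in range(num_classes)])
    for _ in range(steps):
        _, grad = hinge_loss_and_grad(Phi, Z, y)
        Phi = Phi - lr * grad / N
        Phi = np.stack([project_psd(P) for P in Phi])
    return Phi

def predict(Phi, X):
    """Assign each frame to the class with the smallest distance z^T Phi_c z."""
    Z = np.hstack([X, np.ones((len(X), 1))])
    return np.einsum('ni,cij,nj->nc', Z, Phi, Z).argmin(axis=1)
```

Under these assumptions the loss is a sum of hinge functions that are linear in the matrices `Phi_c`, and the PSD cone is a convex set, so the toy problem is convex, which mirrors the convexity claim in the abstract. The full method described there additionally handles Gaussian mixtures and sequences of observations, which this sketch omits.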

Topics

Machine Learning > Core Methods > Classification
Machine Learning > Learning Types > Adversarial Learning
Speech & Audio > Recognition > Automatic Speech Recognition
Machine Learning > Learning Types > Metric Learning

Keywords

large margin learning, convex optimization, large margin, automatic speech recognition, large margin training, phonetic recognition, margin maximization, hidden Markov model, Gaussian mixture model, acoustic feature