An EM Algorithm for Localizing Multiple Sound Sources in Reverberant Environments

Michael I. Mandel; Daniel P. Ellis; Tony Jebara

NIPS 2006 (NeurIPS 2006)

Abstract

We present a method for localizing and separating sound sources in stereo recordings that is robust to reverberation and does not make any assumptions about the source statistics. The method consists of a probabilistic model of binaural multisource recordings and an expectation maximization algorithm for finding the maximum likelihood parameters of that model. These parameters include distributions over delays and assignments of time-frequency regions to sources. We evaluate this method against two comparable algorithms on simulations of simultaneous speech from two or three sources. Our method outperforms the others in anechoic conditions and performs as well as the better of the two in the presence of reverberation.
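To make the idea of "distributions over delays and assignments of time-frequency regions to sources" concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation. It assumes a simplified model in which each time-frequency point carries an interaural phase difference equal to frequency times a source delay plus Gaussian phase noise, with delays restricted to a discrete grid. The function name `em_delay_mixture`, the per-source initial delay guesses `init_delays`, the sign convention for the phase difference, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np


def wrap(phase):
    """Wrap phase values into the interval (-pi, pi]."""
    return np.angle(np.exp(1j * phase))


def em_delay_mixture(ipd, omega, delays, init_delays, n_iter=50):
    """Toy EM over (source, delay) pairs for interaural phase differences.

    ipd[f, t]   : observed phase difference at frequency f, frame t (radians)
    omega[f]    : angular frequency of bin f (radians per sample)
    delays      : grid of candidate delays (samples)
    init_delays : one rough initial delay guess per source (an assumption)
    """
    delays = np.asarray(delays, dtype=float)
    init_delays = np.asarray(init_delays, dtype=float)
    n_sources = len(init_delays)

    # resid[f, t, d]: wrapped phase error if point (f, t) had delay delays[d].
    resid = wrap(ipd[:, :, None] - omega[:, None, None] * delays[None, None, :])
    sq = resid[:, :, None, :] ** 2  # shape (F, T, 1, D)

    # psi[i, d]: joint weight of source i and delay d, started as a bump
    # around each source's initial guess to break the symmetry between sources.
    psi = np.exp(-0.5 * (delays[None, :] - init_delays[:, None]) ** 2)
    psi /= psi.sum()
    sigma2 = np.ones(n_sources)  # per-source phase-noise variance

    for _ in range(n_iter):
        # E-step: posterior nu[f, t, i, d] of each (source, delay) pair.
        s2 = sigma2[None, None, :, None]
        log_post = (np.log(psi + 1e-12)[None, None]
                    - 0.5 * sq / s2 - 0.5 * np.log(2 * np.pi * s2))
        log_post -= log_post.max(axis=(2, 3), keepdims=True)
        nu = np.exp(log_post)
        nu /= nu.sum(axis=(2, 3), keepdims=True)

        # M-step: re-estimate the delay weights and the noise variances.
        psi = nu.mean(axis=(0, 1))
        sigma2 = (nu * sq).sum(axis=(0, 1, 3)) / nu.sum(axis=(0, 1, 3))
        sigma2 = np.maximum(sigma2, 1e-4)  # floor to avoid collapse

    # Soft assignment of each time-frequency point to each source,
    # taken from the last E-step by summing out the delay.
    mask = nu.sum(axis=3)
    est_delays = delays[psi.argmax(axis=1)]
    return est_delays, mask


# Synthetic demo (made-up data): two sources, each point owned by one source.
rng = np.random.default_rng(0)
omega = np.linspace(0.05, np.pi, 64)
n_frames = 50
true_delays = np.array([-3.0, 3.0])
labels = rng.integers(0, 2, size=(omega.size, n_frames))
noise = 0.1 * rng.standard_normal(labels.shape)
ipd = wrap(omega[:, None] * true_delays[labels] + noise)

est_delays, mask = em_delay_mixture(ipd, omega, np.arange(-5, 6),
                                    init_delays=[-2, 2])
accuracy = (mask.argmax(axis=2) == labels).mean()
print("estimated delays:", est_delays, "assignment accuracy:", accuracy)
```

In this sketch the array `psi` plays the role of a distribution over delays for each source, and `mask` holds the soft assignments of time-frequency points to sources. Thresholding `mask` or using it as a soft gain would be one way to turn localization into separation, though the paper's own model and initialization may differ from these assumptions.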

Topics

Machine Learning > Learning Types > Unsupervised Learning

Machine Learning > Optimization & Theory > Optimization

Machine Learning > Optimization & Theory > Statistical Learning

Speech & Audio > Recognition > Speech Recognition

Speech & Audio > Analysis > Speaker Verification

Machine Learning > Bayesian & Probabilistic > Probabilistic Modeling

Speech & Audio > Analysis > Speech Analysis

Machine Learning > Core Methods > Optimization

Machine Learning > Learning Types > Optimization

Speech & Audio > Processing > Speech Enhancement

Keywords

source localization, source separation, sound source localization, EM algorithm, expectation maximization, maximum likelihood, binaural recordings, sound source separation, sound separation, probabilistic model, reverberant environment, binaural audio

Related papers

Temporal Coding using the Response Properties of Spiking Neurons 2006

Parameter Expanded Variational Bayesian Methods 2006

Effects of Stress and Genotype on Meta-parameter Dynamics in Reinforcement Learning 2006

Ordinal Regression by Extended Binary Classification 2006

Blind source separation for over-determined delayed mixtures 2006