INTERSPEECH 2019

Multiple Sound Source Localization with SVD-PHAT

Abstract

This paper introduces a modification of the phase transform on singular value decomposition (SVD-PHAT) method to localize multiple sound sources. The goal is to improve localization accuracy while keeping the algorithmic complexity low enough for real-time applications. The method relies on multiple scans of the search space, with projection of each low-dimensional observation onto orthogonal subspaces. We show that it localizes multiple sound sources more accurately than discrete SRP-PHAT, reducing the root mean square error (RMSE) by up to 0.0395 radians.
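The scan-and-project idea can be illustrated with a generic deflation loop. This is a minimal sketch under stated assumptions, not the paper's implementation: `scan_and_deflate`, the candidate matrix `D` (one unit-norm low-dimensional vector per candidate direction), and the observation `z` are hypothetical names, and the scoring rule (magnitude of the inner product) is an assumed stand-in for the actual SVD-PHAT energy.

```python
import numpy as np


def scan_and_deflate(z, D, n_sources):
    """Generic multi-scan deflation (illustrative, hypothetical helper).

    z: (K,) low-dimensional observation.
    D: (Q, K) matrix with one unit-norm row per candidate direction.
    Returns the selected candidate indices and the final residual.
    """
    residual = np.asarray(z, dtype=complex).copy()
    D = np.asarray(D, dtype=complex)
    found = []
    for _ in range(n_sources):
        # Scan: score every candidate against the current residual.
        scores = np.abs(D.conj() @ residual)
        q = int(np.argmax(scores))
        found.append(q)
        # Project the residual onto the subspace orthogonal to the
        # selected candidate, so the next scan ignores that source.
        d = D[q]
        residual = residual - d * np.vdot(d, residual)
    return found, residual
```

Each scan costs one matrix-vector product over the low-dimensional observation, and the projection step is a single rank-one update, which is why this kind of loop stays cheap as the number of sources grows.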
