Recursive Sound Source Separation with Deep Learning-based Beamforming for Unknown Number of Sources

Hokuto Munakata; Ryu Takeda; Kazunori Komatani

INTERSPEECH 2023

Abstract

We propose a recursive separation model for an unknown number of sound sources based on deep learning-based beamforming. Recursive separation models have been investigated as a way to separate a mixture signal composed of an unknown number of sources in a single-channel condition. The mixture signal is separated with residual information in a recursive manner. Although the recursive separation model can be extended to a multi-channel condition using a beamforming-based filter, the separation performance is degraded because the beamforming-based filter tends to accumulate estimation errors in the recursions. To address this problem, we introduce a local Gaussian model (LGM)-based recursive separation model. The proposed method mitigates the accumulation of errors by reusing estimated parameters and applying only one filter to the mixture signal. Experimental results show that our proposed method outperforms a separation model using an accumulative filter.
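The recursive idea described in the abstract (split off one source, keep the residual, repeat until nothing is left) can be illustrated with a minimal single-channel sketch. This is not the paper's model: the function names `extract_strongest_tone` and `recursive_separate`, the energy-based stopping rule, and the `energy_floor` and `max_sources` parameters are all assumptions made for illustration, and the toy FFT-bin splitter stands in for a learned separator or beamforming filter.

```python
import numpy as np

def extract_strongest_tone(signal):
    """Toy stand-in for a learned separator (hypothetical, not the paper's model).

    Splits off the strongest FFT bin and returns (source, residual),
    where source + residual equals the input signal.
    """
    spectrum = np.fft.rfft(signal)
    peak = int(np.argmax(np.abs(spectrum)))
    mask = np.zeros_like(spectrum)
    mask[peak] = 1.0
    source = np.fft.irfft(spectrum * mask, n=len(signal))
    return source, signal - source

def recursive_separate(mixture, split_fn, energy_floor=1e-6, max_sources=10):
    """Peel off one source per recursion until the residual is nearly empty.

    The number of sources is not given in advance; it is decided by the
    stopping rule (assumed here: mean residual energy below energy_floor).
    """
    sources = []
    residual = mixture
    while len(sources) < max_sources and np.mean(residual ** 2) > energy_floor:
        source, residual = split_fn(residual)
        sources.append(source)
    return sources, residual

# Three tones placed on exact FFT bins, so the toy splitter recovers them cleanly.
n = 1024
t = np.arange(n)
tones = [amp * np.sin(2 * np.pi * k * t / n)
         for amp, k in [(1.0, 50), (0.6, 120), (0.3, 300)]]
mixture = np.sum(tones, axis=0)

sources, residual = recursive_separate(mixture, extract_strongest_tone)
print(len(sources))
```

In this toy setting each recursion operates on the residual left by the previous one, so any error made by an imperfect splitter would propagate into every later step. That is the accumulation problem the abstract refers to for cascaded beamforming-based filters, and the reason the proposed method instead applies a single filter to the original mixture.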
