Low-Rank Time-Frequency Synthesis

Cédric Févotte; Matthieu Kowalski

NIPS 2014 (NeurIPS 2014)

Abstract

Many single-channel signal decomposition techniques rely on a low-rank factorization of a time-frequency transform. In particular, nonnegative matrix factorization (NMF) of the spectrogram -- the (power) magnitude of the short-time Fourier transform (STFT) -- has been considered in many audio applications. In this setting, NMF with the Itakura-Saito divergence was shown to underlie a generative Gaussian composite model (GCM) of the STFT, a step forward from more empirical approaches based on ad hoc transform and divergence specifications. Still, the GCM is not yet a generative model of the raw signal itself, but only of its STFT. The work presented in this paper fills this remaining gap by proposing a novel signal synthesis model with low-rank time-frequency structure. In particular, our new approach opens the door to multi-resolution representations, which were not possible in the traditional NMF setting. We describe two expectation-maximization algorithms for estimation in the new model and report audio signal processing results on music decomposition and speech enhancement.
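For readers unfamiliar with the baseline the abstract starts from, the following is a minimal sketch of conventional NMF of a power spectrogram under the Itakura-Saito divergence. It is not the low-rank time-frequency synthesis model or either of the EM algorithms proposed in the paper. The function names `is_divergence` and `is_nmf`, the synthetic data, and all parameter values are assumptions made for illustration, and the square-root multiplicative updates are one commonly used majorization-minimization variant rather than a method taken from this paper.

```python
import numpy as np


def is_divergence(V, V_hat):
    """Itakura-Saito divergence, summed over all time-frequency bins."""
    R = V / V_hat
    return float(np.sum(R - np.log(R) - 1.0))


def is_nmf(V, rank, n_iter=200, seed=0, eps=1e-12):
    """Factorize a strictly positive matrix V (freq x time) as W @ H.

    Hypothetical helper: multiplicative updates with a square-root
    exponent (majorization-minimization variant for the IS divergence).
    """
    rng = np.random.default_rng(seed)
    F, N = V.shape
    W = rng.random((F, rank)) + eps  # spectral templates, nonnegative
    H = rng.random((rank, N)) + eps  # temporal activations, nonnegative
    costs = []
    for _ in range(n_iter):
        V_hat = W @ H
        W *= np.sqrt(((V / V_hat**2) @ H.T) / ((1.0 / V_hat) @ H.T))
        V_hat = W @ H
        H *= np.sqrt((W.T @ (V / V_hat**2)) / (W.T @ (1.0 / V_hat)))
        costs.append(is_divergence(V, W @ H))
    return W, H, costs


# Synthetic "power spectrogram" (assumption: no real audio is used here).
# Each bin is an exponential variable whose mean is a rank-3 variance,
# which mimics the power of a zero-mean complex Gaussian coefficient.
rng = np.random.default_rng(1)
W_true = rng.random((64, 3)) + 0.1
H_true = rng.random((3, 100)) + 0.1
V = (W_true @ H_true) * rng.exponential(1.0, size=(64, 100)) + 1e-12

W, H, costs = is_nmf(V, rank=3)
print(f"IS divergence: {costs[0]:.1f} -> {costs[-1]:.1f}")
```

In this sketch the product `W @ H` plays the role of the variance of the STFT coefficients, which is the low-rank structure the abstract refers to. The factorization only models the spectrogram, which is exactly the limitation the paper sets out to remove.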

Topics

Deep Learning > Architectures > Neural Networks
Speech & Audio > Synthesis > Speech Enhancement
Speech & Audio > Analysis > Speech Enhancement
Machine Learning > Core Methods > Matrix Factorization
Mathematics & Optimization > Mathematics > Signal Processing
Speech & Audio > Processing > Speech Enhancement

Keywords

speech enhancement, nonnegative matrix factorization, time-frequency analysis, signal decomposition, audio processing, short-time Fourier transform, time-frequency synthesis, audio decomposition
