Maximum a posteriori Speech Enhancement Based on Double Spectrum

Pejman Mowlaee; Daniel Scheran; Johannes Stahl; Sean U.N. Wood; W. Bastiaan Kleijn

INTERSPEECH 2019

Abstract

While the acoustic frequency domain has been widely used for speech enhancement, the modulation domain is used less often. In this paper, we investigate single-channel speech enhancement in the recently proposed Double Spectrum (DS) framework and provide insights into the statistical properties of speech and noise in the DS domain. Based on this statistical analysis, we derive a maximum a posteriori (MAP) estimator of speech in the DS domain. In experiments, we evaluate the speech enhancement performance of the proposed method and relevant benchmarks in the acoustic frequency and modulation domains, and show that the proposed method achieves a good balance between noise attenuation and speech distortion across various SNRs and noise types.

Topics

Artificial Intelligence > Bayesian & Probabilistic > Bayesian Learning
Speech & Audio > Synthesis > Speech Enhancement
Mathematics & Optimization > Probability

Keywords

speech enhancement, signal processing, Bayesian estimation, maximum a posteriori, double spectrum, noise attenuation
