Speech Enhancement with Variance Constrained Autoencoders

D.T. Braithwaite; W. Bastiaan Kleijn

INTERSPEECH 2019

Abstract

Recent machine-learning-based approaches to speech enhancement operate in the time domain and have been shown to outperform classical enhancement methods. Two such models are SE-GAN and SE-WaveNet, both of which rely on complex neural network architectures, making them expensive to train. We propose using the Variance Constrained Autoencoder (VCAE) for speech enhancement. Our model uses a more straightforward neural network structure than competing solutions and is a natural model for the task of speech enhancement. We demonstrate experimentally that the proposed enhancement model outperforms SE-GAN and SE-WaveNet in terms of the perceptual quality of the enhanced signals.

Topics

Deep Learning > Architectures > Autoencoders
Deep Learning > Models > Generative Models

Keywords

speech enhancement, variational autoencoder, perceptual quality, neural network, signal quality
