Speaker-Aware Deep Denoising Autoencoder with Embedded Speaker Identity for Speech Enhancement

Fu-Kai Chuang; Syu-Siang Wang; Jeih-Weih Hung; Yu Tsao; Shih-Hau Fang

INTERSPEECH 2019

Abstract

Previous studies indicate that noise and speaker variations can degrade the performance of deep-learning-based speech-enhancement systems. To improve system performance under such variations, we propose a novel speaker-aware system that integrates a deep denoising autoencoder (DDAE) with an embedded speaker identity. The system first extracts embedded speaker identity features using a neural network model; the DDAE then takes the features augmented with this speaker information as input and generates enhanced spectra. With the additional embedded features, the speech-enhancement system can be guided to generate the optimal output for the given speaker identity. We tested the proposed system on the TIMIT dataset. Experimental results showed that it improved the quality and intelligibility of speech recovered from utterances corrupted by additive noise. In addition, the results suggest that the system, when combined with speaker features, is robust to unseen speakers.
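The two-stage pipeline described in the abstract (extract a speaker embedding, then feed the augmented features to the DDAE) can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give layer sizes, feature dimensions, or the embedding extractor, so `SPEC_DIM`, `EMB_DIM`, `HIDDEN`, the three-layer feed-forward network, and the randomly initialized weights are all hypothetical stand-ins. The speaker embedding is treated as an already-extracted vector, and "augmented" is assumed to mean frame-wise concatenation.

```python
import math
import random


def dense(x, weights, bias, activation=None):
    """Apply one fully connected layer to a single feature vector."""
    out = [sum(w * v for w, v in zip(row, x)) + b
           for row, b in zip(weights, bias)]
    if activation == "relu":
        out = [max(0.0, v) for v in out]
    return out


def init_layer(n_in, n_out, rng):
    """Random weights stand in for trained parameters (illustration only)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def speaker_aware_enhance(noisy_frames, speaker_embedding, layers):
    """Append the speaker embedding to every noisy frame, then run the DDAE."""
    enhanced = []
    for frame in noisy_frames:
        # Augmented input: noisy spectral frame + speaker embedding (assumed).
        h = list(frame) + list(speaker_embedding)
        for i, (w, b) in enumerate(layers):
            is_last = i == len(layers) - 1
            # Linear output layer, ReLU hidden layers (an assumption).
            h = dense(h, w, b, activation=None if is_last else "relu")
        enhanced.append(h)
    return enhanced


# Hypothetical sizes, kept tiny so the example runs instantly.
SPEC_DIM, EMB_DIM, HIDDEN = 8, 4, 16
rng = random.Random(0)
layers = [
    init_layer(SPEC_DIM + EMB_DIM, HIDDEN, rng),
    init_layer(HIDDEN, HIDDEN, rng),
    init_layer(HIDDEN, SPEC_DIM, rng),
]
noisy_frames = [[rng.random() for _ in range(SPEC_DIM)] for _ in range(5)]
speaker_embedding = [rng.random() for _ in range(EMB_DIM)]

enhanced = speaker_aware_enhance(noisy_frames, speaker_embedding, layers)
print(len(enhanced), len(enhanced[0]))  # 5 8
```

The point of the sketch is only the data flow: the same speaker vector is attached to every frame of an utterance, so the enhancement network sees who is speaking alongside what was heard. In a real system the weights would be trained on paired noisy and clean spectra, and the embedding would come from a separate speaker model.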

Topics

Machine Learning > Core Methods > Representation Learning
Speech & Audio > Synthesis > Speech Enhancement

Keywords

speech enhancement, speaker verification, noise reduction, speaker identity, deep denoising autoencoder
