Conv-TasSAN: Separative Adversarial Network Based on Conv-TasNet

Chengyun Deng; Yi Zhang; Shiqian Ma; Yongtao Sha; Hui Song; Xiangang Li

INTERSPEECH 2020

Abstract

Conv-TasNet has shown competitive performance on single-channel speech source separation. In this paper, we investigate whether separation performance can be further improved by optimizing the training mechanism while keeping the same network structure. Motivated by the successful application of generative adversarial networks (GANs) to speech enhancement tasks, we propose a novel Separative Adversarial Network called Conv-TasSAN, in which the separator is realized with the Conv-TasNet architecture. A discriminator is introduced to optimize the separator with respect to a specific speech objective metric. This helps the separator network capture the distribution of the speech sources more accurately and also prevents over-smoothing. Experiments on the WSJ0-2mix dataset confirm the superior performance of the proposed method over Conv-TasNet in terms of SI-SNR and PESQ improvement.
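The abstract reports results as SI-SNR improvement but does not spell out the metric. The sketch below uses the commonly cited definition of scale-invariant SNR, which is assumed (not confirmed by the abstract) to match the one used in the paper. The function name `si_snr`, the `eps` stabilizer, and the synthetic signals are illustrative choices. A noisy copy of the reference stands in for a real two-speaker mixture.

```python
import math
import random


def si_snr(estimate, reference, eps=1e-8):
    """Scale-invariant SNR in dB between two equal-length 1-D signals."""
    if len(estimate) != len(reference):
        raise ValueError("estimate and reference must have the same length")
    n = len(reference)
    # Remove the mean of each signal before projecting.
    est_mean = sum(estimate) / n
    ref_mean = sum(reference) / n
    est = [x - est_mean for x in estimate]
    ref = [x - ref_mean for x in reference]
    # Project the estimate onto the reference to get the target component.
    dot = sum(e * r for e, r in zip(est, ref))
    ref_energy = sum(r * r for r in ref)
    scale = dot / (ref_energy + eps)
    target = [scale * r for r in ref]
    # Whatever is left after removing the target component counts as noise.
    noise = [e - t for e, t in zip(est, target)]
    target_energy = sum(t * t for t in target)
    noise_energy = sum(v * v for v in noise)
    return 10.0 * math.log10((target_energy + eps) / (noise_energy + eps))


# Synthetic stand-ins (assumption): Gaussian "speech" plus Gaussian interference.
random.seed(0)
reference = [random.gauss(0.0, 1.0) for _ in range(4000)]
mixture = [r + 0.5 * random.gauss(0.0, 1.0) for r in reference]
estimate = [r + 0.1 * random.gauss(0.0, 1.0) for r in reference]

# SI-SNR improvement: gain of the separated estimate over the unprocessed mixture.
improvement = si_snr(estimate, reference) - si_snr(mixture, reference)
print(f"SI-SNR improvement: {improvement:.2f} dB")
```

Because the estimate is projected onto the reference before the energies are compared, rescaling the estimate leaves the score essentially unchanged, which is the property the "scale-invariant" name refers to.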

Topics

Machine Learning > Learning Types > Adversarial Learning
Speech & Audio > Synthesis > Speech Enhancement
Machine Learning > Learning Types > Deep Learning
Deep Learning > Learning Types > Adversarial Learning

Keywords

adversarial learning; speech separation; source separation; speech enhancement; generative adversarial network; speech source separation; separative adversarial network; SI-SNR improvement
