A Repetitive Spectrum Learning Framework for Monaural Speech Enhancement in Extremely Low SNR Environments (Student Abstract)

Wenxin Tai

2022 AAAI AAAI 2022

A Repetitive Spectrum Learning Framework for Monaural Speech Enhancement in Extremely Low SNR Environments (Student Abstract)

Abstract

Abstract Monaural speech enhancement (SE) at an extremely low signal-to-noise ratio (SNR) condition is a challenging problem and rarely investigated in previous studies. Most SE methods experience failures in this situation due to three major factors: overwhelmed vocals, expanded SNR range, and short-sighted feature processing modules. In this paper, we present a novel and general training paradigm dubbed repetitive learning (RL). Unlike curriculum learning that focuses on learning multiple different tasks sequentially, RL is more inclined to learn the same content repeatedly where the knowledge acquired in previous stages can be used to facilitate calibrating feature representations. We further propose an RL-based end-to-end SE method named SERL. Experimental results on TIMIT dataset validate the superior performance of our method.

🌉 Interdisciplinary Bridge — Deep Learning and Machine Learning and Speech & Audio

🧭 Keyword Pioneer — repetitive learning

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Wenxin Tai

Topics

Machine Learning > Learning Types > Continual Learning Deep Learning > Architectures > Neural Networks Speech & Audio > Synthesis > Speech Enhancement Machine Learning > Learning Types > Representation Learning Deep Learning > Learning Types > Deep Learning Deep Learning > Learning Types > Representation Learning Speech & Audio > Processing > Speech Enhancement

Keywords

speech enhancement feature representation signal-to-noise ratio monaural speech repetitive learning monaural audio low snr environment spectrum learning extremely low snr

Download PDF

Related papers

Dynamic Spatial Propagation Network for Depth Completion 2022

FedFR: Joint Optimization Federated Framework for Generic and Personalized Face Recognition 2022

Memory-Guided Semantic Learning Network for Temporal Sentence Grounding 2022

AnchorFace: Boosting TAR@FAR for Practical Face Recognition 2022

Parallel and High-Fidelity Text-to-Lip Generation 2022