Efficient Low-Latency Speech Enhancement with Mobile Audio Streaming Networks

Michal Romaniuk; Piotr Masztalski; Karol Piaskowski; Mateusz Matuszewski

INTERSPEECH 2020

Abstract

We propose Mobile Audio Streaming Networks (MASnet) for efficient low-latency speech enhancement. MASnet is particularly suitable for mobile devices and other applications where computational capacity is limited. It processes linear-scale spectrograms, transforming successive noisy frames into complex-valued ratio masks, which are then applied to the respective noisy frames. MASnet can operate in a low-latency incremental inference mode whose complexity matches that of layer-by-layer batch mode. Compared to a similar fully convolutional architecture, MASnet incorporates depthwise and pointwise convolutions for a large reduction in fused multiply-accumulate operations per second (FMA/s), at the cost of some reduction in SNR.
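
The two mechanisms named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the mask is applied by element-wise complex multiplication with the noisy frame, and that the usual per-output-position operation counts hold for a standard 1-D convolution versus a depthwise convolution followed by a pointwise one. The function names, the three-bin frame, and the layer sizes are all hypothetical and are not taken from the paper.

```python
def apply_complex_mask(noisy_frame, mask):
    """Multiply each noisy spectrogram bin by its complex mask value (assumed element-wise)."""
    if len(noisy_frame) != len(mask):
        raise ValueError("frame and mask must have the same number of bins")
    return [m * x for m, x in zip(mask, noisy_frame)]


def standard_conv_fma(kernel_size, c_in, c_out):
    """FMAs per output position for a standard 1-D convolution."""
    return kernel_size * c_in * c_out


def separable_conv_fma(kernel_size, c_in, c_out):
    """FMAs per output position for a depthwise convolution followed by a pointwise one."""
    depthwise = kernel_size * c_in  # one filter per input channel
    pointwise = c_in * c_out        # 1x1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical three-bin noisy frame and mask, for illustration only.
noisy = [1 + 1j, 2 - 1j, 0.5j]
mask = [0.5 + 0j, 1j, 1 - 1j]
enhanced = apply_complex_mask(noisy, mask)

# Hypothetical layer sizes (kernel 3, 64 input and 64 output channels).
ratio = standard_conv_fma(3, 64, 64) / separable_conv_fma(3, 64, 64)
```

Because each mask value is complex, it can change both the magnitude and the phase of a bin, which a real-valued mask cannot do. With the hypothetical sizes above, the separable layer needs 4,288 FMAs per output position against 12,288 for the standard layer; the actual savings in MASnet depend on its real layer configuration, which the abstract does not give.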

Topics

Speech & Audio > Synthesis > Speech Enhancement

Keywords

deep learning, speech enhancement, mobile device, low latency, complex-valued ratio mask
