INTERSPEECH 2020

Bandpass Noise Generation and Augmentation for Unified ASR

Abstract

Data simulation is a crucial technique for building robust automatic speech recognition (ASR) systems. We develop this work in the scope of data augmentation and improve robustness by generating new bandpass noise resources from an existing noise corpus. We design numerous bandpass filters with varying center frequencies and bandwidths, and apply them to the corpus to obtain the corresponding bandpass noise samples. We augment our baseline data simulation with these bandpass noises to inject additional robustness and improve generalization to generic and unknown acoustic scenarios. This work targets ASR robustness to individual subband noises, and improves robustness to unseen real-world noise that can be approximated as a factorial combination of subband noises. We demonstrate our work on a large-scale unified ASR task. We obtain a 7% relative word error rate reduction (WERR) across unseen acoustic conditions and an 11% WERR for kids' speech. We also demonstrate generalization to new ASR applications.
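
The bandpass noise generation step can be illustrated with a minimal sketch. The abstract does not specify the filter design, so the second-order (biquad) bandpass below, in the widely used Audio EQ Cookbook form, is an assumption rather than the authors' method. White Gaussian noise stands in for the existing noise corpus, and all names (`bandpass_coeffs`, `apply_filter`, `make_bandpass_noises`) and the grid of center frequencies and bandwidths are hypothetical.

```python
import math
import random


def bandpass_coeffs(center_hz, bandwidth_hz, sample_rate):
    """Biquad bandpass with unity gain at center_hz (assumed design)."""
    w0 = 2.0 * math.pi * center_hz / sample_rate
    q = center_hz / bandwidth_hz          # narrower band -> higher Q
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    # Coefficients normalized so that a[0] == 1.
    b = (alpha / a0, 0.0, -alpha / a0)
    a = (1.0, -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0)
    return b, a


def apply_filter(b, a, x):
    """Run a biquad over x with the Direct Form I difference equation."""
    y = []
    x1 = x2 = y1 = y2 = 0.0
    for xn in x:
        yn = b[0] * xn + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        y.append(yn)
    return y


def rms(x):
    """Root-mean-square level of a signal."""
    return math.sqrt(sum(v * v for v in x) / len(x))


def make_bandpass_noises(noise, sample_rate, centers_hz, bandwidths_hz):
    """Filter one noise signal through every valid (center, bandwidth) pair."""
    bank = {}
    for fc in centers_hz:
        for bw in bandwidths_hz:
            # Skip bands that would reach 0 Hz or the Nyquist frequency.
            if fc - bw / 2.0 <= 0 or fc + bw / 2.0 >= sample_rate / 2.0:
                continue
            b, a = bandpass_coeffs(fc, bw, sample_rate)
            bank[(fc, bw)] = apply_filter(b, a, noise)
    return bank


# Stand-in for one second of corpus noise at 16 kHz (assumption).
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]

# Hypothetical grid of center frequencies and bandwidths, in Hz.
bank = make_bandpass_noises(noise, 16000, [500, 1000, 2000, 4000], [200, 1000])
for (fc, bw), samples in sorted(bank.items()):
    print(f"center={fc} Hz, bandwidth={bw} Hz, rms={rms(samples):.3f}")
```

Each entry in `bank` is a new noise sample concentrated in one subband. In an augmentation pipeline these samples would be added to the pool that the data simulation draws from, alongside the original noise corpus. A steeper filter, such as a higher-order Butterworth design, could be substituted without changing the overall structure.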
