INTERSPEECH 2024

Elucidating Clock-drift Using Real-world Audios In Wireless Mode For Time-offset Insensitive End-to-End Asynchronous Acoustic Echo Cancellation

Abstract

External playback and microphone array devices connected over wireless channels lack a common clock reference. This leads to non-linear, asynchronous clock-drift effects with respect to the effective echo path, thereby causing degradation that accelerates over time when conventional filter-based echo cancellers are used. We delineate clock-drift-associated problems quantitatively by utilizing real-world audio streams recorded using various devices in wireless mode. We revisit and compare, in situ, conventional signal-processing and deep neural network methods. We also introduce a novel end-to-end approach for asynchronous acoustic echo cancellation using Convolutional Recurrent Neural Networks and demonstrate state-of-the-art echo suppression performance, without the need for time resynchronization, under the non-linear clock-drift conditions observed in the real world.
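
To make the clock-drift problem concrete, the following is a minimal sketch, not the method of this paper. It assumes a constant drift expressed in parts per million (ppm), which is a simplification of the non-linear drift described above. The function names, the 16 kHz sampling rate, and the 100 ppm figure are illustrative assumptions. It shows how a small clock mismatch accumulates into a growing time offset between the playback reference and the microphone capture.

```python
import numpy as np

def drift_offset_samples(duration_s, fs_hz=16000, drift_ppm=100.0):
    """Offset (in samples) accumulated between two clocks after duration_s seconds.

    Assumes a constant drift of drift_ppm parts per million (hypothetical value).
    """
    return duration_s * fs_hz * drift_ppm * 1e-6

def apply_clock_drift(x, drift_ppm):
    """Simulate capturing x with a clock running drift_ppm faster than the source.

    Uses linear interpolation, so this is only a rough model of a real
    sampling-rate mismatch.
    """
    ratio = 1.0 + drift_ppm * 1e-6
    # Number of output samples whose read positions stay inside x.
    n_out = int(np.floor((len(x) - 1) / ratio)) + 1
    read_positions = np.arange(n_out) * ratio
    return np.interp(read_positions, np.arange(len(x)), x)

if __name__ == "__main__":
    for seconds in (1, 10, 60):
        offset = drift_offset_samples(seconds)
        print(f"{seconds:>3d} s -> {offset:.1f} samples of accumulated offset")
```

Under these assumptions the offset grows linearly with time, so a fixed-delay alignment between the far-end reference and the microphone signal becomes stale as a recording proceeds. This is the kind of mismatch that a time-offset-insensitive model would need to tolerate.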
