INTERSPEECH 2021

Towards an Accent-Robust Approach for ATC Communications Transcription

Abstract

Air Traffic Control (ATC) communications are a typical example of a domain where Automatic Speech Recognition faces several challenges: the audio is quite noisy because of the characteristics of the capturing mechanisms, all speakers use a specific English-based phraseology, and a significant number of pilots and controllers are non-native English speakers. The aim of this work is to enhance pilot-ATC communications by adding a Speech to Text (STT) capability that transcribes ATC speech into text on the cockpit interfaces, helping the pilot understand ATC speech more effectively: verifying what he/she heard on the radio against the text transcription, deciphering non-native English accents from controllers, and not losing time asking ATC to repeat the message several times. In this paper, we first describe an accent analysis study that was carried out both on a theoretical level and with the help of feedback from several hundred airline pilots. Then, we present the dataset that was set up for this work. Finally, we describe the experiments we carried out and the impact of speaker accent on the performance of a speech to text engine.
