INTERSPEECH 2024

ASA: An Auditory Spatial Attention Dataset with Multiple Speaking Locations

Abstract

Recent studies have demonstrated the feasibility of localizing an attended sound source from electroencephalography (EEG) signals in a cocktail party scenario, a task referred to as EEG-enabled Auditory Spatial Attention Detection (ASAD). Despite this promise, ASAD datasets remain scarce, and most existing ones are recorded with only two speaking locations. To bridge this gap, we introduce a new Auditory Spatial Attention (ASA) dataset featuring multiple speaking locations of sound sources. The new dataset is designed to challenge and refine deep neural network solutions for real-world applications. Furthermore, we build a channel attention convolutional neural network (CA-CNN) as a reference model for ASA, which serves as a competitive benchmark for future studies.
