Neural Speaker Extraction with Speaker-Speech Cross-Attention Network

Wupeng Wang; Chenglin Xu; Meng Ge; Haizhou Li

INTERSPEECH 2021

Abstract

In this paper, we propose a novel time-domain speaker-speech cross-attention network, a variant of the SpEx [1] architecture. The network consists of speech semantic layers, which capture high-level dependencies in the audio features, and cross-attention layers, which fuse the speaker embedding with the speech features to estimate the speaker mask. We implement the cross-attention layers with both parallel and sequential concatenation techniques. Experiments show that the proposed models consistently outperform the state-of-the-art time-domain speaker extraction baseline on the WSJ0-2mix dataset.
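
To make the fusion idea concrete, the following is a minimal sketch of how a speaker embedding might be combined with speech features through scaled dot-product attention to produce a mask. It is not the authors' architecture: the abstract gives no layer details, so the choice to prepend the speaker embedding as an extra token, the absence of learned projections, the sigmoid mask, and all function names (`cross_attention`, `estimate_mask`, `apply_mask`) are assumptions made for illustration only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query attends over all keys."""
    scale = math.sqrt(len(keys[0]))
    dim = len(values[0])
    outputs = []
    for q in queries:
        weights = softmax([dot(q, k) / scale for k in keys])
        outputs.append(
            [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
        )
    return outputs


def estimate_mask(speech_frames, speaker_embedding):
    """Fuse speaker and speech information, then squash to a (0, 1) mask.

    Assumption: the speaker embedding is prepended to the frame sequence
    as one extra token, so every frame can attend to it and to the other
    frames. A real model would use learned query/key/value projections.
    """
    memory = [speaker_embedding] + speech_frames
    fused = cross_attention(speech_frames, memory, memory)
    return [[1.0 / (1.0 + math.exp(-x)) for x in frame] for frame in fused]


def apply_mask(speech_frames, mask):
    """Element-wise masking of the (hypothetical) encoded mixture frames."""
    return [
        [x * m for x, m in zip(frame, mask_row)]
        for frame, mask_row in zip(speech_frames, mask)
    ]


if __name__ == "__main__":
    # Toy input: 2 frames of 3-dimensional features and a 3-dimensional
    # speaker embedding (values are arbitrary).
    frames = [[0.5, -1.0, 2.0], [1.5, 0.2, -0.3]]
    speaker = [1.0, 0.0, -1.0]
    mask = estimate_mask(frames, speaker)
    print(apply_mask(frames, mask))
```

The mask has the same shape as the frame sequence, so it can be applied element-wise to the encoded mixture. This sketch does not attempt to reproduce the parallel and sequential concatenation variants named in the abstract.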

Authors

Wupeng Wang, Chenglin Xu, Meng Ge, Haizhou Li

Topics

Deep Learning > Architectures > Transformers
Deep Learning > Techniques > Model Architecture
Speech & Audio > Analysis > Speaker Verification

Keywords

speaker extraction, speaker-speech cross-attention, speaker mask, audio feature, neural network

Related papers

Energy-Friendly Keyword Spotting System Using Add-Based Convolution 2021

Dialogue Situation Recognition for Everyday Conversation Using Multimodal Information 2021

Using Games to Augment Corpora for Language Recognition and Confusability 2021

A Psychology-Driven Computational Analysis of Political Interviews 2021

The 2020 Personalized Voice Trigger Challenge: Open Datasets, Evaluation Metrics, Baseline System and Results 2021