An Effective Mutual Mean Teaching Based Domain Adaptation Method for Sound Event Detection

Xu Zheng; Yan Song; Li-Rong Dai; Ian McLoughlin; Lin Liu

2021 INTERSPEECH INTERSPEECH 2021

An Effective Mutual Mean Teaching Based Domain Adaptation Method for Sound Event Detection

Abstract

In this paper, we present a novel mutual mean teaching based domain adaptation (MMT-DA) method for sound event detection (SED) task, which can effectively exploit synthetic data to improve the SED performance. Existing methods simply treat the synthetic data as strongly-labeled data in semi-supervised learning (SSL) framework. Benefiting from the strong labels of synthetic data, superior SED performance can be achieved. However, a distribution mismatch between synthetic and real data raises an evident challenge for domain adaptation (DA). In MMT-DA, convolutional recurrent neural networks (CRNN) learned from different datasets (i.e. total data:real+synthetic, and real data) are exploited for DA. Specifically, mean teacher method using CRNN is employed for utilizing the unlabeled real data. To compensate the domain diversity, an additional domain classifier with gradient reverse layer(GRL) is used for training a mean teacher for total data. The student CRNNs are mutually taught using the soft predictions of unlabeled data obtained from different teachers. Furthermore, a strip pooling based attention module is exploited to model the inter-dependencies between channels and time-frequency dimensions to exploit the structure information. Experimental results on Task4 of DCASE2020 demonstrate the ability of the proposed method, achieving 52.0% F1-score on the validation dataset, which outperforms the winning system’s 50.6%.

🌉 Interdisciplinary Bridge — Deep Learning and Machine Learning

🧭 Keyword Pioneer — mutual mean teaching

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Xu Zheng , Yan Song , Li-Rong Dai , Ian McLoughlin , Lin Liu

Topics

Machine Learning > Learning Types > Semi-Supervised Learning Machine Learning > Application Areas > Domain Adaptation Deep Learning > Architectures > Neural Networks

Keywords

semi-supervised learning domain adaptation synthetic datum sound event detection convolutional recurrent neural network mutual mean teaching

Download PDF

Related papers

Energy-Friendly Keyword Spotting System Using Add-Based Convolution 2021

Dialogue Situation Recognition for Everyday Conversation Using Multimodal Information 2021

Using Games to Augment Corpora for Language Recognition and Confusability 2021

A Psychology-Driven Computational Analysis of Political Interviews 2021

The 2020 Personalized Voice Trigger Challenge: Open Datasets, Evaluation Metrics, Baseline System and Results 2021