INTERSPEECH 2024

Confidence-aware Hypothesis Transfer Networks for Source-Free Cross-Corpus Speech Emotion Recognition

Abstract

The goal of source-free cross-corpus speech emotion recognition (SER) is to transfer emotion knowledge from a source corpus to a target corpus without access to the source data. To address this challenge, we develop a novel method named Confidence-aware Hypothesis Transfer Network (CaHTN), which consists of two modules. The first module, hypothesis implicit transfer, leverages the frozen source classifier (the hypothesis) to force target samples to implicitly align with the source hypothesis through information maximization. The second, a bidirectional confident self-training module, exploits not only positive pseudo-label information but also negative pseudo-label information to enhance target feature extraction. To verify its effectiveness, we design twelve source-free cross-corpus SER tasks and conduct extensive experiments on CASIA, EmoDB, EMOVO, and eNTERFACE. Experimental results indicate that CaHTN achieves state-of-the-art performance on source-free cross-corpus SER.
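
The abstract does not give the exact loss formulations, so the following is only a minimal sketch of the two ideas under stated assumptions. It uses the information maximization objective commonly found in source-free adaptation (low per-sample entropy, high entropy of the batch-averaged prediction) and a simple threshold rule for picking positive and negative pseudo labels. The function names and the threshold values `pos_threshold` and `neg_threshold` are hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one list of class logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs, eps=1e-12):
    """Shannon entropy (in nats) of one probability vector."""
    return -sum(p * math.log(p + eps) for p in probs)


def information_maximization_loss(batch_logits):
    """Mean per-sample entropy minus entropy of the batch-mean prediction.

    Minimizing this pushes each target prediction to be confident while
    keeping the predictions spread across classes over the batch.
    The logits are assumed to come from the frozen source classifier.
    """
    probs = [softmax(logits) for logits in batch_logits]
    n = len(probs)
    k = len(probs[0])
    mean_entropy = sum(entropy(p) for p in probs) / n
    marginal = [sum(p[c] for p in probs) / n for c in range(k)]
    return mean_entropy - entropy(marginal)


def select_pseudo_labels(probs, pos_threshold=0.9, neg_threshold=0.05):
    """Hypothetical confidence rule (assumption, not the paper's rule).

    Returns (positive, negatives): `positive` is the top class index if its
    probability reaches pos_threshold, else None; `negatives` lists classes
    whose probability is at most neg_threshold, i.e. classes the sample is
    confidently assumed NOT to belong to.
    """
    top = max(range(len(probs)), key=probs.__getitem__)
    positive = top if probs[top] >= pos_threshold else None
    negatives = [c for c, p in enumerate(probs) if p <= neg_threshold]
    return positive, negatives
```

In this sketch, a batch whose predictions are both confident and spread across emotion classes yields a lower loss than a batch of uniform predictions. A sample with no confident positive label can still contribute through its negative labels, which is the intuition behind using pseudo labels in both directions.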
