INTERSPEECH 2023

SOT: Self-supervised Learning-Assisted Optimal Transport for Unsupervised Adaptive Speech Emotion Recognition

Abstract

In cross-domain speech emotion recognition (SER), reducing the global probability distribution distance (GPDD) between domains plays a crucial role in unsupervised domain adaptation (UDA), and optimal transport (OT) is a natural way to measure this distance. However, owing to the large intra-class variation of emotion categories, samples lying in overlapping regions may induce negative transport. Moreover, OT considers only the GPDD and therefore cannot efficiently transport hard-to-discriminate samples, because it does not exploit the local structure of intra-class distributions. We propose a self-supervised learning (SSL)-assisted optimal transport (SOT) algorithm for cross-domain SER. First, we regularize OT's transport coupling to mitigate negative transport; then, we design an SSL module that emphasizes local intra-class structure to assist OT in capturing this otherwise nontransferable knowledge. Cross-domain SER experiments show that SOT substantially outperforms state-of-the-art UDA methods.
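To make concrete how OT can measure a distance between two domains, the following is a minimal sketch of entropy-regularized OT (the Sinkhorn iteration) between two sets of feature vectors. It is a generic illustration, not the SOT algorithm: the regularized coupling and SSL module described above are not reproduced here, and the function name `sinkhorn_plan`, the uniform sample weights, the squared Euclidean cost, the parameter values, and the random "features" are all assumptions made for the example.

```python
import numpy as np


def sinkhorn_plan(source, target, reg=0.5, n_iters=200):
    """Entropy-regularized OT between two feature sets (generic sketch).

    Assumes uniform weights on the samples and a squared Euclidean cost.
    Returns the transport plan and the transport cost under that plan.
    """
    n, m = len(source), len(target)
    a = np.full(n, 1.0 / n)  # uniform mass on source samples
    b = np.full(m, 1.0 / m)  # uniform mass on target samples

    # Pairwise squared Euclidean cost, scaled to [0, 1] for stability.
    diff = source[:, None, :] - target[None, :, :]
    cost = np.sum(diff ** 2, axis=-1)
    cost = cost / cost.max()

    # Gibbs kernel and alternating scaling updates (Sinkhorn iteration).
    kernel = np.exp(-cost / reg)
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        v = b / (kernel.T @ u)
        u = a / (kernel @ v)

    plan = u[:, None] * kernel * v[None, :]
    return plan, float(np.sum(plan * cost))


# Hypothetical stand-ins for source-domain and target-domain features.
rng = np.random.default_rng(0)
source_feats = rng.normal(0.0, 1.0, size=(20, 8))
target_feats = rng.normal(0.5, 1.0, size=(30, 8))

plan, ot_cost = sinkhorn_plan(source_feats, target_feats)
print(plan.shape, ot_cost)
```

Each entry `plan[i, j]` is the mass moved from source sample `i` to target sample `j`. In this sketch the plan depends only on pairwise costs and ignores class structure, which is the limitation the abstract attributes to plain OT.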
