INTERSPEECH 2022

Coupled Discriminant Subspace Alignment for Cross-database Speech Emotion Recognition

Abstract

Speech emotion recognition (SER) is a long-standing and important research problem in speech signal processing. In practice, the training and test data are often collected under different conditions, e.g., in different languages or with different recording devices, which can severely degrade recognition performance. To tackle this problem, we propose a novel transfer learning algorithm, named coupled discriminant subspace alignment (CDSA), for cross-database SER. In CDSA, we first conduct linear discriminant analysis (LDA) on the source and target databases separately. Meanwhile, we learn a latent common subspace in which each target sample is represented as a combination of source samples. Furthermore, we align the projection subspaces of the source and target databases to make the model more robust. Extensive experiments on four benchmark databases demonstrate the effectiveness of the proposed method.
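
To make the three ingredients named in the abstract concrete, the following is a minimal sketch of how such terms could be evaluated. It is not the authors' formulation: the abstract does not state the objective, so the trace-difference form of the LDA term, the squared Frobenius-norm penalties, the use of pseudo-labels for the target database, and all names (`scatter_matrices`, `cdsa_like_terms`, `Ps`, `Pt`, `Z`) are assumptions made for illustration only.

```python
import numpy as np


def scatter_matrices(X, y):
    """Within-class (Sw) and between-class (Sb) scatter for X of shape (d, n)."""
    d = X.shape[0]
    mean_all = X.mean(axis=1, keepdims=True)
    Sw = np.zeros((d, d))
    Sb = np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[:, y == c]
        mean_c = Xc.mean(axis=1, keepdims=True)
        Sw += (Xc - mean_c) @ (Xc - mean_c).T
        Sb += Xc.shape[1] * (mean_c - mean_all) @ (mean_c - mean_all).T
    return Sw, Sb


def cdsa_like_terms(Xs, ys, Xt, yt_pseudo, Ps, Pt, Z):
    """Evaluate three illustrative terms (assumed forms, not taken from the paper).

    Xs: (d, ns) source features, ys: (ns,) source labels
    Xt: (d, nt) target features, yt_pseudo: (nt,) assumed target pseudo-labels
    Ps, Pt: (d, k) source and target projections
    Z: (ns, nt) coefficients expressing target samples via source samples
    """
    Sw_s, Sb_s = scatter_matrices(Xs, ys)
    Sw_t, Sb_t = scatter_matrices(Xt, yt_pseudo)
    # LDA-style term: small within-class and large between-class scatter.
    lda = np.trace(Ps.T @ (Sw_s - Sb_s) @ Ps) + np.trace(Pt.T @ (Sw_t - Sb_t) @ Pt)
    # Common-subspace term: projected target samples rebuilt from source samples.
    recon = np.linalg.norm(Pt.T @ Xt - Ps.T @ Xs @ Z, "fro") ** 2
    # Alignment term: keep the two projection matrices close to each other.
    align = np.linalg.norm(Ps - Pt, "fro") ** 2
    return lda, recon, align
```

Under these assumptions, a full method would minimize a weighted sum of the three terms over `Ps`, `Pt`, and `Z`. The sketch only evaluates them for given inputs and does not include any optimization procedure.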
