INTERSPEECH 2024

E-ODN: An Emotion Open Deep Network for Generalised and Adaptive Speech Emotion Recognition

Abstract

Recognising the widest possible range of emotions is a major challenge in Speech Emotion Recognition (SER), especially for complex and mixed emotions. However, because existing datasets cover a limited number of emotional types and distribute data unevenly across them, current SER models are typically trained and used on a narrow range of emotional types. In this paper, we propose the Emotion Open Deep Network (E-ODN) model to address this issue. We also introduce a novel Open-Set Recognition method that maps sample emotional features into a three-dimensional emotional space. The method can infer unknown emotions and initialise weights for new types, enabling the model to dynamically learn and infer emerging emotional types. Empirical results show that our recognition model outperforms state-of-the-art (SOTA) models on multi-type unbalanced data and can also perform finer-grained emotion recognition.
