INTERSPEECH 2020
Speech Emotion Recognition with Discriminative Feature Learning
Abstract
The performance of a speech emotion recognition (SER) system relies heavily on the deep features learned from speech. Most state-of-the-art work has focused on developing various deep architectures for effective feature learning. In this study, we make the first attempt to explore feature discriminability instead. Building on our SER baseline system, we propose three approaches to enhance feature discriminability: two based on loss functions and one based on combined attentive pooling. Evaluations on the IEMOCAP database consistently validate the effectiveness of all three proposals. Compared to the baseline system, each of the three proposed systems achieves an absolute accuracy improvement of at least 4.0%, with no increase in the total number of parameters.
🐝 Cross-Pollinator: Artificial Intelligence, Computer Science, Computer Vision, Deep Learning, Healthcare & Medicine, Machine Learning, Mathematics & Optimization, Natural Language Processing, Speech & Audio
🌉 Interdisciplinary Bridge: Machine Learning and Speech & Audio
Topics
Machine Learning > Core Methods > Classification
Machine Learning > Core Methods > Representation Learning
Machine Learning > Optimization & Theory > Loss Functions
Machine Learning > Learning Types > Representation Learning
Machine Learning > Core Methods > Feature Learning
Speech & Audio > Analysis > Speech Analysis