Automatically augmenting an emotion dataset improves classification using audio

Egor Lakomkin; Cornelius Weber; Stefan Wermter

EACL 2017

Abstract

In this work, we tackle the problem of speech emotion classification. One of the issues in the area of affective computation is that the amount of annotated data is very limited. At the same time, the number of ways in which the same emotion can be expressed verbally is enormous due to variability between speakers. This is one of the factors that limit performance and generalization. We propose a simple method that extracts audio samples from movies using textual sentiment analysis. As a result, it is possible to automatically construct a larger dataset of audio samples with positive, negative, and neutral emotional speech. We show that pretraining a recurrent neural network on such a dataset yields better results on the challenging EmotiW corpus. This experiment shows the potential benefit of combining textual sentiment analysis with vocal information.
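
The labeling idea in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes that the movie text comes from SRT-style subtitles with timestamps, and it stands in a tiny hand-written word list (`POSITIVE`, `NEGATIVE`) for a real textual sentiment analyzer. All names here (`Segment`, `label_text`, `label_subtitles`) are hypothetical. The output is a list of time spans with sentiment labels, which could then be used to cut the matching audio from the movie.

```python
import re
from dataclasses import dataclass
from typing import List

# Hypothetical toy lexicon; a real system would use a trained sentiment model.
POSITIVE = {"love", "great", "wonderful", "happy", "thank"}
NEGATIVE = {"hate", "terrible", "awful", "angry", "afraid"}

# SRT timestamps look like 00:01:02,500 (hours:minutes:seconds,milliseconds).
_TIME = re.compile(r"(\d+):(\d+):(\d+)[,.](\d{3})")


@dataclass
class Segment:
    start: float  # start time in seconds
    end: float    # end time in seconds
    text: str     # subtitle text spoken in this span
    label: str    # "positive", "negative", or "neutral"


def to_seconds(stamp: str) -> float:
    """Convert an SRT timestamp to seconds."""
    match = _TIME.fullmatch(stamp.strip())
    if match is None:
        raise ValueError(f"not an SRT timestamp: {stamp!r}")
    h, m, s, ms = (int(part) for part in match.groups())
    return h * 3600 + m * 60 + s + ms / 1000


def label_text(text: str) -> str:
    """Label text by counting distinct lexicon words (assumed stand-in classifier)."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    pos, neg = len(words & POSITIVE), len(words & NEGATIVE)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"


def label_subtitles(srt: str) -> List[Segment]:
    """Parse SRT text and attach a sentiment label to each timed entry."""
    segments = []
    for block in re.split(r"\n\s*\n", srt.strip()):
        lines = block.strip().splitlines()
        # Expect: index line, timing line, then one or more text lines.
        if len(lines) < 3 or "-->" not in lines[1]:
            continue
        start, end = lines[1].split("-->")
        text = " ".join(lines[2:])
        segments.append(
            Segment(to_seconds(start), to_seconds(end), text, label_text(text))
        )
    return segments
```

Each returned `Segment` carries the start and end time of one subtitle entry, so an audio tool could extract exactly that span and store it under the predicted label. Whether subtitle timing is precise enough for clean audio cuts is a separate question that this sketch does not address.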

Topics

Machine Learning > Learning Types > Self-Supervised Learning
Machine Learning > Application Areas > Data Augmentation
Speech & Audio > Analysis > Prosody Analysis
Interdisciplinary > Social > Affective Computing
Speech & Audio > Analysis > Speech Analysis

Keywords

sentiment analysis, data augmentation, emotion recognition, audio analysis, recurrent neural network, speech emotion classification, textual sentiment analysis, audio-visual correlation
