INTERSPEECH 2021

Language or Paralanguage, This is the Problem: Comparing Depressed and Non-Depressed Speakers Through the Analysis of Gated Multimodal Units

Abstract

Speech-based depression detection has attracted significant attention in recent years. A debated question is whether it is better to use language (what people say), paralanguage (how they say it), or a combination of the two. This article addresses the question through the analysis of a Gated Multimodal Unit trained to weight the two modalities according to how effectively they account for the condition of a speaker (depressed or non-depressed). The experiments involved 29 individuals diagnosed with depression and 30 non-depressed participants. The approach achieves an accuracy of 83.0% (F1 score 80.0%), and the results show that the Gated Multimodal Unit tends to give more weight to paralanguage. However, the relative contribution of language tends to be higher, to a statistically significant extent, in the case of non-depressed speakers.
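
To make the weighting idea concrete, the following is a minimal sketch of a two-modality gated unit in its commonly used form: each modality is projected through a tanh layer, and a sigmoid gate computed from both inputs mixes the two projections. It is not the authors' implementation. The feature dimensions, the random weights, and the names `gmu`, `x_lang`, and `x_para` are assumptions made for illustration, and the choice that the gate `z` weights language (with `1 - z` weighting paralanguage) is an arbitrary convention.

```python
import math
import random


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def gmu(x_lang, x_para, W_lang, W_para, W_gate):
    """Fuse two modality vectors with a learned gate.

    Returns the fused representation h and the gate z, where z[i] is the
    weight given to language and 1 - z[i] the weight given to paralanguage.
    """
    h_lang = [math.tanh(a) for a in matvec(W_lang, x_lang)]
    h_para = [math.tanh(a) for a in matvec(W_para, x_para)]
    # The gate sees the concatenation of both raw inputs.
    z = [sigmoid(a) for a in matvec(W_gate, x_lang + x_para)]
    h = [zi * hl + (1.0 - zi) * hp for zi, hl, hp in zip(z, h_lang, h_para)]
    return h, z


def random_matrix(rows, cols, rng):
    """Hypothetical untrained weights; a real model would learn these."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
d_lang, d_para, d_hidden = 4, 3, 2  # illustrative sizes only

W_lang = random_matrix(d_hidden, d_lang, rng)
W_para = random_matrix(d_hidden, d_para, rng)
W_gate = random_matrix(d_hidden, d_lang + d_para, rng)

x_lang = [rng.uniform(-1, 1) for _ in range(d_lang)]
x_para = [rng.uniform(-1, 1) for _ in range(d_para)]

h, z = gmu(x_lang, x_para, W_lang, W_para, W_gate)
mean_language_weight = sum(z) / len(z)
print("gate (language weight per unit):", z)
print("mean language weight:", mean_language_weight)
```

In a sketch like this, averaging the gate values per speaker gives one number describing how much the fused representation relies on language rather than paralanguage. Comparing such averages between the depressed and non-depressed groups is one plausible way to carry out the kind of analysis the abstract describes.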
