Audio-Visual Prediction of Head-Nod and Turn-Taking Events in Dyadic Interactions

Bekir Berker Türker; Engin Erzin; Yucel Yemez; Metin Sezgin

INTERSPEECH 2018

Abstract

Head-nods and turn-taking both contribute significantly to conversational dynamics in dyadic interactions. Timely prediction and use of these events are valuable for dialog management systems in human-robot interaction. In this study, we present an audio-visual prediction framework for head-nod and turn-taking events that can also be used in real-time systems. Prediction systems based on Support Vector Machines (SVM) and Long Short-Term Memory Recurrent Neural Networks (LSTM-RNN) are trained on human-human conversational data. Unimodal and multimodal classification performance for head-nod and turn-taking events is reported on the IEMOCAP dataset.

Topics

Artificial Intelligence > Core AI > Multimodal Learning
Artificial Intelligence > Core AI > Trajectory Prediction
Robotics > Capabilities > Human-Robot Interaction
Speech & Audio > Analysis > Speech Analysis
Machine Learning > Learning Types > Multi-Modal Learning

Keywords

multimodal learning; audio-visual learning; human-robot interaction; support vector machine; long short-term memory; dialog management; dialog system; turn-taking prediction; head nod prediction; audio-visual prediction; head-nod detection; head nod
