INTERSPEECH 2022

Speech imitation skills predict automatic phonetic convergence: a GMM-UBM study on L2

Abstract

Phonetic convergence is the observation that two interlocutors adapt their speech towards one another on an acoustic-phonetic level. It happens automatically and unconsciously, but people can also deliberately imitate others when asked to do so. Here, we investigate to what degree people converge to their interlocutor in a scripted dialogue, both when they are and when they are not explicitly asked to imitate that interlocutor. More specifically, we collected two separate data sets in which Italian- and French-native participants read English sentences aloud in alternating speaking turns. We then compared the results of the two groups, which differ in language background. We used a Gaussian mixture model – universal background model (GMM-UBM) to assess phonetic convergence on the sentence level. The GMM-UBM configuration was optimized to best distinguish between speakers on validation data. We found that, relative to the baseline, people converge towards one another while interacting, and converge even more when explicitly asked to imitate. These results hold across both data sets. More importantly, the degree of implicit convergence people display is related to how good an explicit imitator they are, supporting the claim that the two phenomena are based on the same neurocognitive process.
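As an illustration of how a GMM-UBM can score a sentence, the following is a minimal sketch and not the configuration used in the study. It assumes diagonal-covariance mixtures whose parameters are already trained, and it scores a sentence by the average per-frame log-likelihood ratio between an interlocutor's model and the UBM. The abstract does not specify the features, the model adaptation, or the exact convergence measure, so all function names and toy values here are hypothetical.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at frame x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )

def gmm_frame_loglik(x, weights, means, variances):
    """Log-likelihood of one frame under a GMM (log-sum-exp over components)."""
    terms = [
        math.log(w) + log_gauss_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))

def avg_loglik(frames, gmm):
    """Average per-frame log-likelihood of a sentence under a GMM."""
    return sum(gmm_frame_loglik(x, *gmm) for x in frames) / len(frames)

def llr_score(frames, speaker_gmm, ubm):
    """Average log-likelihood ratio: speaker model versus UBM."""
    return avg_loglik(frames, speaker_gmm) - avg_loglik(frames, ubm)

# Toy 2-D models as (weights, means, variances); all values are made up.
ubm = ([0.5, 0.5], [[0.0, 0.0], [4.0, 4.0]], [[1.0, 1.0], [1.0, 1.0]])
partner = ([0.5, 0.5], [[1.0, 1.0], [5.0, 5.0]], [[1.0, 1.0], [1.0, 1.0]])

# Hypothetical feature frames of one speaker before and during interaction.
baseline_sentence = [[0.0, 0.1], [4.1, 3.9]]
interaction_sentence = [[0.9, 1.0], [5.0, 4.8]]

baseline_llr = llr_score(baseline_sentence, partner, ubm)
interaction_llr = llr_score(interaction_sentence, partner, ubm)
```

Under these assumptions, a higher score for the interaction sentence than for the baseline sentence would be read as the speaker's frames having moved closer to the interlocutor's model than to the speaker-independent background model.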
