INTERSPEECH 2024

Well, what can you do with messy data? Exploring the prosody and pragmatic function of the discourse marker "well" with found data and speech synthesis

Abstract

Recently, there has been growing interest in the synthesis of conversational speech prosody. Conversational prosody is variable and carries many pragmatic functions. As speech synthesis research moves to using large amounts of untranscribed data, it is crucial that we understand the subtle pragmatic differences prosody can make. This study focuses on discourse markers, which are linguistic elements that perform various communicative functions, with their specific roles often linked to their prosodic realisation. In this paper, we explore the prosodic realisation of "well" using an unlabelled corpus of conversational speech. We use clustering to explore the variation in its prosodic realisation and identify common patterns in a data-driven manner. We synthesise the cluster centroids using controllable speech synthesis. Finally, we evaluate how the prosodic realisation of "well" affects the meaning of an utterance.
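
The clustering step described above can be illustrated with a minimal sketch. The abstract does not state which prosodic features or clustering algorithm were used, so everything below is an assumption made for illustration: each token of "well" is represented by an F0 contour in semitones relative to the speaker mean, contours are time-normalised to a fixed length, and plain k-means with farthest-first initialisation groups them. The data are synthetic, and the names `resample_contour`, `farthest_first_init` and `kmeans` are hypothetical helpers, not the authors' code.

```python
import numpy as np


def resample_contour(f0, n_points=10):
    """Linearly interpolate a variable-length F0 contour to n_points samples."""
    f0 = np.asarray(f0, dtype=float)
    src = np.linspace(0.0, 1.0, len(f0))
    dst = np.linspace(0.0, 1.0, n_points)
    return np.interp(dst, src, f0)


def farthest_first_init(X, k):
    """Deterministic init: start at X[0], then repeatedly add the farthest point."""
    centroids = [X[0]]
    for _ in range(1, k):
        # Distance from every point to its nearest already-chosen centroid.
        d = np.min(
            [np.linalg.norm(X - c, axis=1) for c in centroids], axis=0
        )
        centroids.append(X[int(d.argmax())])
    return np.array(centroids)


def assign(X, centroids):
    """Index of the nearest centroid (Euclidean distance) for each row of X."""
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)


def kmeans(X, k, n_iter=50):
    """Plain k-means; returns (labels, centroids)."""
    centroids = farthest_first_init(X, k)
    for _ in range(n_iter):
        labels = assign(X, centroids)
        new = centroids.copy()
        for j in range(k):
            if np.any(labels == j):  # leave an empty cluster where it is
                new[j] = X[labels == j].mean(axis=0)
        if np.allclose(new, centroids):
            break
        centroids = new
    return assign(X, centroids), centroids


# Synthetic stand-in data (assumption): 20 rising and 20 falling contours,
# in semitones relative to speaker mean, with varying numbers of frames.
rng = np.random.default_rng(0)
contours = []
for start, end in [(-2.0, 2.0)] * 20 + [(2.0, -2.0)] * 20:
    n_frames = int(rng.integers(5, 16))
    contours.append(np.linspace(start, end, n_frames) + rng.normal(0, 0.2, n_frames))

X = np.stack([resample_contour(c) for c in contours])
labels, centroids = kmeans(X, k=2)

# Each centroid is an average contour shape that could be passed as a
# prosodic target to a controllable synthesiser.
print(np.round(centroids, 2))
```

In a setup like this, each centroid is a prototypical contour for one group of tokens, which is what makes it a natural input for controllable synthesis and a subsequent listening evaluation.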
