2023
INTERSPEECH
Improved Contextualized Speech Representations for Tonal Analysis
Abstract
We propose fine-tuning wav2vec 2.0 with a cross-entropy loss to classify the tones in an utterance frame by frame. Our study shows that this approach not only improves tone classification accuracy but also yields frame-level representations suitable for tonal analysis. Using these representations, we established that the rising tone produced by third-tone sandhi in Mandarin speech differs from the lexical rising tone, and that a third tone that does not undergo sandhi differs from a third tone outside a sandhi context. Our findings suggest that third-tone sandhi in Mandarin Chinese involves a continuous shift from Tone 3 to Tone 2 rather than a categorical change.
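To make the frame-by-frame objective concrete, the following is a minimal sketch of a frame-level cross-entropy loss, not the authors' implementation. It assumes that some encoder (for example, a fine-tuned wav2vec 2.0 with a linear classification head) has already produced one vector of class scores per frame; that encoder is not modeled here. The function names, the four-class tone inventory, and the `IGNORE` label for frames without a tone are all hypothetical choices made for illustration.

```python
import math

# Hypothetical label for frames that carry no tone (silence, padding).
# The value mirrors the default ignore_index of PyTorch's CrossEntropyLoss,
# but any sentinel outside the class range would work in this sketch.
IGNORE = -100


def log_softmax(logits):
    """Numerically stable log-softmax over one frame's class scores."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def frame_cross_entropy(frame_logits, frame_labels, ignore=IGNORE):
    """Mean cross-entropy over labeled frames; ignored frames are skipped."""
    total, count = 0.0, 0
    for logits, label in zip(frame_logits, frame_labels):
        if label == ignore:
            continue
        total -= log_softmax(logits)[label]
        count += 1
    return total / count if count else 0.0


def frame_accuracy(frame_logits, frame_labels, ignore=IGNORE):
    """Fraction of labeled frames whose highest-scoring class is the label."""
    correct, count = 0, 0
    for logits, label in zip(frame_logits, frame_labels):
        if label == ignore:
            continue
        pred = max(range(len(logits)), key=logits.__getitem__)
        correct += int(pred == label)
        count += 1
    return correct / count if count else 0.0


if __name__ == "__main__":
    # Toy utterance: 4 frames, 4 assumed tone classes (indices 0..3).
    logits = [
        [2.0, 0.1, 0.0, -1.0],
        [0.3, 1.5, 0.2, 0.0],
        [0.0, 0.0, 0.0, 0.0],   # frame with no tone, excluded from the loss
        [0.1, 0.2, 2.2, 0.3],
    ]
    labels = [0, 1, IGNORE, 2]
    print("loss:", frame_cross_entropy(logits, labels))
    print("accuracy:", frame_accuracy(logits, labels))
```

In an actual fine-tuning setup the same quantity would be computed by a deep learning framework and back-propagated through the encoder; the pure-Python version only shows what is being averaged and which frames are left out.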