INTERSPEECH 2019

Low Resource Automatic Intonation Classification Using Gated Recurrent Unit (GRU) Networks Pre-Trained with Synthesized Pitch Patterns

Abstract

Second language learners of British English (BE) are typically taught four intonation classes: Glide-up, Glide-down, Dive and Take-off. We predict the intonation class of a learner’s utterance by modeling the temporal dependencies in its pitch pattern with a gated recurrent unit (GRU) network. To this end, we pre-train the GRU network on a set of synthesized pitch patterns representing each intonation class. For the synthesis, we propose to derive the pitch patterns from the tone sequences that, according to domain knowledge, represent each intonation class. Experiments are conducted on speech data from experts, taken from spoken English training material for teaching BE intonation. With pre-training, the proposed scheme improves the unweighted average recall (UAR) by 4.14% absolute over the proposed approach without pre-training and by 6.01% absolute over the baseline scheme that uses hidden Markov models (HMMs).
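
The UAR figure reported above is the mean of the per-class recalls, so each of the four intonation classes counts equally regardless of how many utterances it has. The following is a minimal sketch of that computation, not the authors' evaluation code; the function name `unweighted_average_recall` and the toy labels are assumptions made purely for illustration.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Return the mean of per-class recalls (each class weighted equally)."""
    total = defaultdict(int)    # number of reference utterances per class
    correct = defaultdict(int)  # number of correctly predicted utterances per class
    for ref, hyp in zip(y_true, y_pred):
        total[ref] += 1
        if ref == hyp:
            correct[ref] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Hypothetical toy labels (NOT data from the paper), deliberately class-imbalanced.
y_true = (["Glide-up"] * 4 + ["Glide-down"] * 2
          + ["Dive"] * 2 + ["Take-off"] * 2)
y_pred = ["Glide-up", "Glide-up", "Glide-up", "Glide-down",
          "Glide-down", "Glide-down",
          "Dive", "Take-off",
          "Take-off", "Glide-up"]

uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(r == h for r, h in zip(y_true, y_pred)) / len(y_true)
print(f"UAR = {uar:.4f}, accuracy = {accuracy:.4f}")
```

On these toy labels the per-class recalls are 0.75, 1.0, 0.5 and 0.5, giving a UAR of 0.6875 against a plain accuracy of 0.7. The two differ because accuracy is dominated by the larger class.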
