INTERSPEECH 2019

A Mandarin Prosodic Boundary Prediction Model Based on Multi-Task Learning

Abstract

In this paper, we propose a Mandarin prosodic boundary prediction model based on a Multi-Task Learning (MTL) architecture. The prosodic structure of Mandarin is a three-level hierarchy with three basic units: Prosodic Word (PW), Prosodic Phrase (PPH) and Intonational Phrase (IPH) [1]. Previous studies usually decompose Mandarin prosodic boundary prediction into three independent tasks, one for each of these unit boundaries [1–4]. In recent years, with the development of deep learning, MTL has achieved state-of-the-art performance on many tasks in the Natural Language Processing (NLP) field [5–7]. Inspired by this, we implement an MTL framework that uses a Bidirectional Long Short-Term Memory network with a Conditional Random Field (BLSTM-CRF) as the basic model and treats the three prediction tasks for PW, PPH and IPH as sub-modules. Under the MTL architecture, the three formerly independent tasks are unified and optimized jointly. The experimental results show that our model is effective for Mandarin prosodic boundary prediction: the overall prediction performance is improved by 0.8%, and the model size is reduced by about 55%.
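As a rough illustration of how three boundary tasks can be unified, the sketch below derives one tag sequence per prosodic unit from a single annotation and sums the per-task losses into one objective. The integer level encoding, the B/NB tag names, the nesting assumption, the equal loss weights and the helper names `split_into_tasks` and `joint_loss` are all assumptions made for this example; the abstract does not specify the label scheme or the loss used in the paper.

```python
from typing import Dict, List, Optional

# Hypothetical encoding of the highest boundary level after each character:
# 0 = no boundary, 1 = PW, 2 = PPH, 3 = IPH.
LEVELS = {"PW": 1, "PPH": 2, "IPH": 3}


def split_into_tasks(boundary_levels: List[int]) -> Dict[str, List[str]]:
    """Derive one B/NB tag sequence per prosodic unit from one annotation.

    Assumption: the hierarchy is nested, so a higher-level boundary also
    counts as a boundary for every lower level.
    """
    return {
        task: ["B" if level >= threshold else "NB" for level in boundary_levels]
        for task, threshold in LEVELS.items()
    }


def joint_loss(
    task_losses: Dict[str, float],
    weights: Optional[Dict[str, float]] = None,
) -> float:
    """Combine per-task losses into one objective (equal weights by default)."""
    if weights is None:
        weights = {task: 1.0 for task in task_losses}
    return sum(weights[task] * loss for task, loss in task_losses.items())


# An 8-character toy sentence with PW, PPH and IPH boundaries.
levels = [0, 1, 0, 2, 0, 1, 0, 3]
tags = split_into_tasks(levels)
total = joint_loss({"PW": 0.5, "PPH": 0.25, "IPH": 0.25})
```

In a full model, each tag sequence would supervise its own CRF output layer, and the summed loss would be backpropagated through whatever parameters the tasks share.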
