INTERSPEECH 2019

Multimedia Simultaneous Translation System for Minority Language Communication with Mandarin

Abstract

Speech recognition for minority languages consistently lags behind that for mainstream languages because of a lack of resources. This work presents a system for simultaneous translation between Mandarin and major minority languages such as Uyghur and Tibetan, in the form of speech, text, and images. The general acoustic model is a factorized TDNN trained with the lattice-free MMI criterion, using a mixed-unit-based lexicon model. For each specific language, the acoustic model is trained by multi-task mix-lingual modeling with shared bottleneck layers, followed by transfer learning. The system also supports state-of-the-art OCR, TTS, and machine translation, so that language input is translated, punctuated, and pronounced in real time. The machine translation component ranked highly in the WMT 18 Mandarin-English and CWMT 18 minority language translation tasks. The system has been integrated into a micro-app in WeChat and can facilitate communication between speakers of Mandarin and minority languages.
