Multimedia Simultaneous Translation System for Minority Language Communication with Mandarin
Abstract
Speech recognition for minority language is always behind main stream due to lack of resources. This work presents a system for simultaneous translation between Mandarin and major minority languages such as Uyghur, Tibetan in shape of speech, text and images. The general acoustic model is trained via factorized TDNN with lattice free MMI criteria using mixed-units based lexicon model. For each specific language, acoustic model is trained by multi-task mix-lingual modeling with shared bottleneck layers followed by transfer learning. Besides, the system also supports state-of-the-art OCR, TTS, and machine translation, by which language information will be real-time translated, punctuated and pronounced. The machine translation behind the system gets a high rank in WMT 18 Mandarin-English and CWMT 18 minority language translation task. The system has integrated into a micro-app at WeChat and can facilitate communication between Mandarin and Minority languages.