INTERSPEECH 2020

A 43 Language Multilingual Punctuation Prediction Neural Network Model

Abstract

Punctuation prediction is a critical component for speech recognition readability and speech translation segmentation. When multiple languages must be supported, traditional monolingual neural network models for punctuation prediction can be costly to manage and may not produce the best accuracy. In this paper, we investigate multilingual Long Short-Term Memory (LSTM) modeling using Byte Pair Encoding (BPE) for punctuation prediction to support 43 languages across 69 countries. Our findings show that a single multilingual BPE-based model can achieve similar or even better performance than separate monolingual word-based models by benefiting from information shared across languages. On an in-domain news text test set, the multilingual model achieves an average F1-score of 80.2%, while on out-of-domain speech recognition text it achieves an F1-score of 73.5%. We also show that the shared information helps when fine-tuning for low-resource languages.
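To make the reported metric concrete, the following is a minimal sketch of how an F1-score for punctuation prediction might be computed from per-token labels. The abstract does not specify the label set or the averaging scheme, so everything here is an assumption for illustration: the function name `punctuation_f1`, the label strings `"O"`, `"COMMA"`, and `"PERIOD"`, and the choice of micro-averaging over punctuation classes while ignoring the no-punctuation class.

```python
def punctuation_f1(reference, predicted, no_punct="O"):
    """Micro-averaged precision, recall, and F1 over punctuation labels.

    Each sequence holds one label per token, naming the punctuation mark
    that follows the token (or `no_punct` if none). The label names and
    the micro-averaging choice are illustrative assumptions, not details
    taken from the paper.
    """
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")

    tp = fp = fn = 0
    for ref, hyp in zip(reference, predicted):
        if hyp != no_punct and hyp == ref:
            tp += 1  # correct punctuation mark predicted
        else:
            if hyp != no_punct:
                fp += 1  # spurious or wrong mark predicted
            if ref != no_punct:
                fn += 1  # reference mark missed or mislabeled

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical four-token example: one comma found, one spurious comma,
# one period missed.
reference = ["O", "COMMA", "O", "PERIOD"]
predicted = ["O", "COMMA", "COMMA", "O"]
precision, recall, f1 = punctuation_f1(reference, predicted)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Under these assumptions, a prediction that substitutes one mark for another counts as both a false positive and a false negative, which is a common but not universal convention.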
