ASR for South Slavic Languages Developed in Almost Automated Way

Jan Nouza; Radek Safarik; Petr Cerva

INTERSPEECH 2016

Abstract

Slavic languages pose several specific challenges that must be addressed in ASR system design. Having already built an engine suited to highly inflected languages, we now focus on adapting it to new languages. Here we present an efficient way to adapt the system to all seven South Slavic languages, using methods and tools that exploit language similarities, easily adjustable grapheme-to-phoneme (G2P) rules, and common phonetic subsets. We show that accurate language and acoustic models can be built in an almost automated way, entirely from resources found on the web. The acoustic models (AMs) are trained via cross-lingual bootstrapping followed by lightly supervised retraining on public data such as broadcast and parliament archives. Tests on a set of main broadcast news programs in each language show word error rate (WER) values in the range of 16.8% to 21.5%; these figures also include errors caused by out-of-language (OOL) utterances, which often occur in this type of spoken program.
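
For readers unfamiliar with the metric reported above, the following is a minimal sketch of how WER is commonly computed: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the recognizer output, divided by the number of reference words. The function name `word_error_rate` and the example sentences are illustrative assumptions, not taken from the paper, and the authors' scoring tool may normalize text differently.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative sketch only: whitespace tokenization, no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing in hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one substitution and one insertion against
# a five-word reference.
wer = word_error_rate("the news starts at seven",
                      "the news start at seven today")
print(f"WER = {wer:.1%}")
```

Under this definition, an OOL utterance that the recognizer transcribes with in-language words contributes substitutions and insertions like any other error, which is consistent with the abstract's note that the reported figures include such errors.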
