2020
INTERSPEECH
Language Modeling for Speech Analytics in Under-Resourced Languages
Abstract
Different language modeling approaches are evaluated on two under-resourced, agglutinative South African languages: Sesotho and isiZulu. The two languages present different challenges to language modeling because of their respective orthographies: isiZulu is written conjunctively, whereas Sesotho is written disjunctively. Two subword modeling approaches are evaluated and shown to reduce the out-of-vocabulary (OOV) rate for isiZulu. For Sesotho, a multi-word approach is evaluated for improving automatic speech recognition (ASR) accuracy, with limited success. RNNs are also evaluated and shown to slightly improve ASR accuracy, despite the relatively small text corpora.
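To make the OOV claim concrete, the following is a minimal sketch of how an OOV rate can be computed at the word level and again after subword segmentation. It is not the paper's method: the abstract does not name its two subword approaches, so the segmenter here (`split_prefix`, with a hand-picked `PREFIXES` list) is a hypothetical stand-in, and the four-word corpus is a toy example chosen only to show the effect.

```python
def oov_rate(train_tokens, test_tokens):
    """Fraction of test tokens that never occur in the training tokens."""
    vocab = set(train_tokens)
    if not test_tokens:
        return 0.0
    unseen = sum(1 for tok in test_tokens if tok not in vocab)
    return unseen / len(test_tokens)


# Hypothetical, hand-picked prefixes used only for illustration;
# a real system would learn or derive its segmentation.
PREFIXES = ("ngiya", "siya")


def split_prefix(word):
    """Split off one known prefix if present, else return the word whole."""
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) > len(prefix):
            return [prefix, word[len(prefix):]]
    return [word]


def to_subwords(tokens):
    """Flatten a token list into its subword pieces."""
    return [piece for tok in tokens for piece in split_prefix(tok)]


# Toy conjunctively written forms: each test word is unseen as a whole,
# but both of its pieces occur in the training data.
train = ["ngiyabonga", "siyahamba"]
test = ["ngiyahamba", "siyabonga"]

word_oov = oov_rate(train, test)
subword_oov = oov_rate(to_subwords(train), to_subwords(test))

print(f"word-level OOV rate:    {word_oov:.2f}")
print(f"subword-level OOV rate: {subword_oov:.2f}")
```

In this toy setup every test word is out of vocabulary as a whole word, while all of its pieces are covered once words are split, which is the general reason subword units can lower the OOV rate for a conjunctively written language.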
Authors
Topics
Artificial Intelligence > Learning Paradigms > Transfer Learning
Natural Language Processing > Generation > Language Modeling
Natural Language Processing > Resources & Methods > Multilingual NLP
Speech & Audio > Recognition > Automatic Speech Recognition
Natural Language Processing > Resources & Methods > Language Modeling
Machine Learning > Learning Types > Language Modeling