Pseudo-Labeling for Domain-Agnostic Bangla Automatic Speech Recognition

Rabindra Nath Nandi; Mehadi Menon; Tareq Muntasir; Sagor Sarker; Quazi Sarwar Muhtaseem; Md. Tariqul Islam; Shammur Chowdhury; Firoj Alam

2023 EMNLP EMNLP 2023

Pseudo-Labeling for Domain-Agnostic Bangla Automatic Speech Recognition

Abstract

AbstractOne of the major challenges for developing automatic speech recognition (ASR) for low-resource languages is the limited access to labeled data with domain-specific variations. In this study, we propose a pseudo-labeling approach to develop a large-scale domain-agnostic ASR dataset. With the proposed methodology, we developed a 20k+ hours labeled Bangla speech dataset covering diverse topics, speaking styles, dialects, noisy environments, and conversational scenarios. We then exploited the developed corpus to design a conformer-based ASR system. We benchmarked the trained ASR with publicly available datasets and compared it with other available models. To investigate the efficacy, we designed and developed a human-annotated domain-agnostic test set composed of news, telephony, and conversational data among others. Our results demonstrate the efficacy of the model trained on psuedo-label data for the designed test-set along with publicly-available Bangla datasets. The experimental resources will be publicly available.https://github.com/hishab-nlp/Pseudo-Labeling-for-Domain-Agnostic-Bangla-ASR

🌉 Interdisciplinary Bridge — Machine Learning and Speech & Audio

🧭 Keyword Pioneer — bangla speech

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Rabindra Nath Nandi , Mehadi Menon , Tareq Muntasir , Sagor Sarker , Quazi Sarwar Muhtaseem , Md. Tariqul Islam , Shammur Chowdhury , Firoj Alam

Topics

Machine Learning > Learning Types > Semi-Supervised Learning Machine Learning > Application Areas > Domain Adaptation Speech & Audio > Recognition > Automatic Speech Recognition

Keywords

domain adaptation automatic speech recognition low-resource language bangla speech

Download PDF

Related papers

Exploring Linguistic Probes for Morphological Generalization 2023

NameGuess: Column Name Expansion for Tabular Data 2023

Vision-Enhanced Semantic Entity Recognition in Document Images via Visually-Asymmetric Consistency Learning 2023

Improving Conversational Recommendation Systems via Bias Analysis and Language-Model-Enhanced Data Augmentation 2023

On the Calibration of Large Language Models and Alignment 2023