PolyWER: A Holistic Evaluation Framework for Code-Switched Speech Recognition

Karima Kadaoui; Maryam Al Ali; Hawau Olamide Toyin; Ibrahim Mohammed; Hanan Aldarmaki

EMNLP 2024


Abstract

Code-switched speech, particularly between languages that use different scripts, can be correctly transcribed in several forms, including different transliterations of the embedded language into the matrix language script. Traditional accuracy measures such as Word Error Rate (WER) are too strict to address this challenge, because they count every acceptable variant other than the single reference form as an error. In this paper, we introduce PolyWER, a framework for evaluating how well speech recognition systems handle language mixing. PolyWER accepts transcriptions of code-mixed segments in different forms, including transliterations and translations. We demonstrate the algorithm's use cases through detailed examples and evaluate it against human judgement. To enable the use of this metric, we augmented the annotations of a publicly available Arabic-English code-switched dataset with transliterations and translations of the code-mixed speech. We also use these additional annotations to fine-tune ASR models and compare their performance using PolyWER. In addition to our main finding on PolyWER's effectiveness, our experiments show that alternative annotations could be more effective for fine-tuning monolingual ASR models.
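To illustrate why accepting several written forms matters, the following is a minimal sketch of a word-level edit distance in which each reference position holds a set of accepted forms. It is not the PolyWER algorithm from the paper, whose matching rules are not given in the abstract; the function name `multi_ref_wer`, the exact-match rule, and the example sentence are all assumptions made for illustration.

```python
from typing import List, Set


def multi_ref_wer(reference: List[Set[str]], hypothesis: List[str]) -> float:
    """Word error rate where each reference slot is a set of accepted forms.

    Hypothetical helper for illustration only; not the paper's method.
    """
    n, m = len(reference), len(hypothesis)
    if n == 0:
        raise ValueError("reference must contain at least one word")

    # dp[i][j] = minimum edits aligning the first i reference slots
    # with the first j hypothesis words (standard Levenshtein table).
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i  # i deletions
    for j in range(1, m + 1):
        dp[0][j] = j  # j insertions

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # A hypothesis word is correct if it matches ANY accepted form.
            cost = 0 if hypothesis[j - 1] in reference[i - 1] else 1
            dp[i][j] = min(
                dp[i - 1][j] + 1,         # deletion
                dp[i][j - 1] + 1,         # insertion
                dp[i - 1][j - 1] + cost,  # match or substitution
            )
    return dp[n][m] / n


# Made-up example: "I have a meeting tomorrow" with an embedded English word.
hypothesis = ["عندي", "اجتماع", "بكرة"]

# Single-form reference: only the original English spelling is accepted.
strict = [{"عندي"}, {"meeting"}, {"بكرة"}]

# Multi-form reference: a transliteration and a translation are also accepted.
lenient = [{"عندي"}, {"meeting", "ميتنغ", "اجتماع"}, {"بكرة"}]

print(multi_ref_wer(strict, hypothesis))
print(multi_ref_wer(lenient, hypothesis))
```

With the single-form reference, the translated word counts as a substitution; with the multi-form reference, the same hypothesis is scored as fully correct. A real metric would likely need softer matching than exact string equality, for example to tolerate spelling variation in transliterations.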



Topics

Machine Learning > Optimization & Theory > Optimization
Speech & Audio > Recognition > Automatic Speech Recognition
Natural Language Processing > Applications > Speech Recognition
Machine Learning > Learning Types > Multi-Lingual Learning

Keywords

speech recognition; automatic speech recognition; evaluation framework; code-switched speech recognition; word error rate; multilingual speech recognition; speech evaluation; speech recognition evaluation
