2024 EMNLP EMNLP 2024

Modeling Gender and Dialect Bias in Automatic Speech Recognition

Abstract

AbstractDialect and gender-based biases have become an area of concern in language-dependent AI systemsincluding around automatic speech recognition (ASR) which processes speech audio into text. These potential biases raise concern for discriminatory outcomes with AI systems depending on demographic- particularly gender discrimination against women, and racial discrimination against minorities with ethnic or cultural English dialects.As such we aim to evaluate the performance of ASR systems across different genders and across dialects of English. Concretely, we take a deep dive of the performance of ASR systems on men and women across four US-based English dialects: Standard American English (SAE), African American Vernacular English (AAVE), Chicano English, and Spanglish. To do this, we construct a labeled dataset of 13 hours of podcast audio, transcribed by speakers of the represented dialects. We then evaluate zero-shot performance of different automatic speech recognition models on our dataset, and further finetune models to better understand how finetuning can impact performance. Our work fills the gap of investigating possible gender disparities within underrepresented dialects.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Machine Learning and Speech & Audio
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Security & Privacy, Speech & Audio