Toward Fairness in Speech Recognition: Discovery and mitigation of performance disparities

Pranav Dheram; Murugesan Ramakrishnan; Anirudh Raju; I-Fan Chen; Brian King; Katherine Powell; Melissa Saboowala; Karan Shetty; Andreas Stolcke

INTERSPEECH 2022

Abstract

As with other forms of AI, speech recognition has recently been examined for performance disparities across different user cohorts. One approach to achieving fairness in speech recognition is to (1) identify speaker cohorts that suffer from subpar performance and (2) apply fairness mitigation measures targeting the cohorts so discovered. In this paper, we report initial findings on both discovery and mitigation of performance disparities, using data from a product-scale AI assistant speech recognition system. We compare cohort discovery based on geographic and demographic information to a more scalable method that groups speakers without human labels, using speaker embedding technology. For fairness mitigation, we find that oversampling of underrepresented cohorts, as well as modeling speaker cohort membership through additional input variables, reduces the gap between top- and bottom-performing cohorts without degrading overall recognition accuracy.

Index Terms: speech recognition, performance fairness, cohort discovery.
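The abstract's two quantities, the gap between top- and bottom-performing cohorts and the oversampling of underrepresented cohorts, can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the function names, the toy data, the choice of word error rate (WER) as the accuracy measure, the relative form of the gap, and the integer repeat-factor scheme are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def cohort_wer(records):
    """Pool word errors per cohort.

    records: iterable of (cohort, word_errors, reference_words) tuples.
    Returns a dict mapping cohort -> WER (errors / reference words).
    """
    errors = defaultdict(int)
    words = defaultdict(int)
    for cohort, n_err, n_ref in records:
        errors[cohort] += n_err
        words[cohort] += n_ref
    return {c: errors[c] / words[c] for c in words if words[c] > 0}


def wer_gap(wers):
    """Relative WER gap between the worst and best cohort (assumed definition)."""
    best, worst = min(wers.values()), max(wers.values())
    return (worst - best) / best if best > 0 else float("inf")


def oversampling_factors(counts):
    """Integer repeat factor per cohort so each approaches the largest cohort's size."""
    largest = max(counts.values())
    return {c: max(1, round(largest / n)) for c, n in counts.items()}


# Toy data: (cohort, word errors, reference words) per utterance batch.
records = [("A", 5, 100), ("A", 3, 100), ("B", 12, 100), ("C", 6, 50)]
wers = cohort_wer(records)
gap = wer_gap(wers)

# Toy training-set sizes per cohort (number of utterances).
factors = oversampling_factors({"A": 200, "B": 100, "C": 50})
```

In this toy setup, a mitigation would count as successful if `gap` shrinks after retraining with the repeated data while the pooled WER over all cohorts does not increase.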

Topics

Machine Learning > Application Areas > Fairness; Speech & Audio > Recognition > Speech Recognition

Keywords

automatic speech recognition, speaker embedding, performance disparity, fairness mitigation, cohort discovery
