Bias Silhouette Analysis: Towards Assessing the Quality of Bias Metrics for Word Embedding Models

Maximilian Spliethöver; Henning Wachsmuth

IJCAI 2021

Abstract

Word embedding models reflect bias towards genders, ethnicities, and other social groups present in the underlying training data. Metrics such as ECT, RNSB, and WEAT quantify bias in these models based on predefined word lists representing social groups and bias-conveying concepts. However, it remains unclear how suitable these lists, let alone the bias metrics in general, actually are for revealing bias. In this paper, we study how to assess the quality of bias metrics for word embedding models. In particular, we present a generic method, Bias Silhouette Analysis (BSA), that quantifies the accuracy and robustness of such a metric and of the word lists used. Given a biased and an unbiased reference embedding model, BSA systematically applies the metric to both models for several subsets of the lists. The variance and rate of convergence of each model's bias values then indicate the robustness of the word lists, whereas the distance between the two models' values indicates the general accuracy of the metric with the word lists. We demonstrate the behavior of BSA on two standard embedding models for the three mentioned metrics with several word lists from existing research.
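The procedure outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy two-dimensional embeddings, and `toy_metric` (a simple stand-in, not ECT, RNSB, or WEAT) are all hypothetical, and subsampling only the attribute list is an assumption made for brevity.

```python
import random
import statistics
from typing import Callable, Dict, List, Sequence

Embedding = Dict[str, Sequence[float]]


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = sum(a * a for a in u) ** 0.5
    nv = sum(b * b for b in v) ** 0.5
    return dot / (nu * nv) if nu and nv else 0.0


def toy_metric(model: Embedding, group_a: List[str], group_b: List[str],
               attributes: List[str]) -> float:
    """Hypothetical stand-in metric: absolute difference between the two
    groups' mean cosine similarity to the attribute words."""
    def mean_sim(group: List[str]) -> float:
        return statistics.fmean(
            cosine(model[g], model[a]) for g in group for a in attributes
        )
    return abs(mean_sim(group_a) - mean_sim(group_b))


def subset_scores(model: Embedding, metric: Callable[..., float],
                  group_a: List[str], group_b: List[str],
                  attributes: List[str], sizes: Sequence[int],
                  samples: int, seed: int = 0) -> Dict[int, List[float]]:
    """Apply the metric to random attribute subsets of each given size."""
    rng = random.Random(seed)
    return {
        k: [metric(model, group_a, group_b, rng.sample(attributes, k))
            for _ in range(samples)]
        for k in sizes
    }


def silhouette_summary(biased: Dict[int, List[float]],
                       unbiased: Dict[int, List[float]]) -> Dict[int, dict]:
    """Per subset size: variance of each model's bias values (robustness)
    and the gap between the models' mean values (accuracy)."""
    return {
        k: {
            "var_biased": statistics.pvariance(biased[k]),
            "var_unbiased": statistics.pvariance(unbiased[k]),
            "gap": statistics.fmean(biased[k]) - statistics.fmean(unbiased[k]),
        }
        for k in biased
    }


# Toy data (assumed for illustration only): in the "biased" model the
# attribute words lean towards group A; in the "unbiased" model they are
# equally similar to both groups.
group_a, group_b = ["he"], ["she"]
attributes = ["a1", "a2", "a3", "a4"]
biased_model = {"he": [1.0, 0.0], "she": [0.0, 1.0],
                "a1": [1.0, 0.1], "a2": [0.9, 0.2],
                "a3": [1.0, 0.0], "a4": [0.8, 0.1]}
unbiased_model = {"he": [1.0, 0.0], "she": [0.0, 1.0],
                  "a1": [1.0, 1.0], "a2": [2.0, 2.0],
                  "a3": [0.5, 0.5], "a4": [3.0, 3.0]}

sizes = [1, 2, 3, 4]
summary = silhouette_summary(
    subset_scores(biased_model, toy_metric, group_a, group_b,
                  attributes, sizes, samples=20),
    subset_scores(unbiased_model, toy_metric, group_a, group_b,
                  attributes, sizes, samples=20),
)
for k, row in summary.items():
    print(k, row)
```

In this sketch, a variance that shrinks as the subset size grows would suggest that the word list is robust for the metric, while a large gap between the two reference models would suggest that the metric separates biased from unbiased embeddings with that list.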

Topics

Machine Learning > Core Methods > Metric Learning
Machine Learning > Application Areas > Fairness
Natural Language Processing > Resources & Methods > Text Representation
Mathematics & Optimization > Mathematics > Statistics
Machine Learning > Optimization & Theory > Statistics
Artificial Intelligence > Core AI > Fairness
Machine Learning > Learning Types > Fairness

Keywords

word embedding, social bias, bias metric, robustness analysis, silhouette analysis, metric accuracy, gender bias, embedding model, ethnic bias
