Gender Bias in Coreference Resolution: Evaluation and Debiasing Methods

Jieyu Zhao; Tianlu Wang; Mark Yatskar; Vicente Ordonez; Kai-Wei Chang

NAACL 2018

Abstract

In this paper, we introduce a new benchmark for coreference resolution focused on gender bias, WinoBias. Our corpus contains Winograd-schema style sentences with entities corresponding to people referred to by their occupation (e.g. the nurse, the doctor, the carpenter). We demonstrate that a rule-based, a feature-rich, and a neural coreference system all link gendered pronouns to pro-stereotypical entities with higher accuracy than anti-stereotypical entities, by an average difference of 21.1 in F1 score. Finally, we demonstrate a data-augmentation approach that, in combination with existing word-embedding debiasing techniques, removes the bias demonstrated by these systems in WinoBias without significantly affecting their performance on existing datasets.
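The abstract does not spell out the data-augmentation step, so the following is only a minimal sketch of one plausible form of it: gender swapping, where each training sentence is duplicated with gendered pronouns exchanged. The swap table `SWAP` and the helpers `gender_swap` and `augment` are hypothetical names introduced for illustration, not the authors' code, and the table is far smaller than anything a real system would need.

```python
import re

# Hypothetical, deliberately tiny swap table (an assumption, not the paper's
# lexicon). "her" is ambiguous between object and possessive use; this sketch
# assumes the possessive reading and maps it to "his".
SWAP = {
    "he": "she", "she": "he",
    "him": "her", "his": "her", "her": "his",
    "himself": "herself", "herself": "himself",
}


def gender_swap(sentence: str) -> str:
    """Return the sentence with each gendered pronoun in SWAP exchanged."""
    def replace(match: re.Match) -> str:
        word = match.group(0)
        swapped = SWAP.get(word.lower())
        if swapped is None:
            return word  # not a gendered pronoun we know about
        # Preserve sentence-initial capitalisation.
        return swapped.capitalize() if word[0].isupper() else swapped

    return re.sub(r"[A-Za-z]+", replace, sentence)


def augment(corpus: list[str]) -> list[str]:
    """Return the original sentences followed by their gender-swapped copies."""
    return list(corpus) + [gender_swap(s) for s in corpus]


if __name__ == "__main__":
    example = "The physician hired the secretary because he was overwhelmed with clients."
    for line in augment([example]):
        print(line)
```

Training on the union of original and swapped sentences gives a system the same amount of evidence for each pronoun in every occupational context, which is the intuition behind using augmentation to reduce the pro- versus anti-stereotypical gap. A realistic version would also have to swap gendered nouns and handle names, which this sketch ignores.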

Topics

Machine Learning > Application Areas > Fairness
Natural Language Processing > Understanding > Coreference Resolution
Artificial Intelligence > Core AI > Fairness
Machine Learning > Learning Types > Data Augmentation

Keywords

data augmentation, coreference resolution, gender bias, word embedding debiasing, stereotypical entities

Related papers

A Melody-Conditioned Lyrics Language Model 2018

Before Name-Calling: Dynamics and Triggers of Ad Hominem Fallacies in Web Argumentation 2018

Automated Essay Scoring in the Presence of Biased Ratings 2018

Neural Automated Essay Scoring and Coherence Modeling for Adversarially Crafted Input 2018

QuickEdit: Editing Text & Translations by Crossing Words Out 2018