Reducing Gender Bias in Abusive Language Detection

Ji Ho Park; Jamin Shin; Pascale Fung

EMNLP 2018

Abstract

Abusive language detection models tend to be biased toward identity words of a certain group of people because of imbalanced training datasets. For example, “You are a good woman” was classified as “sexist” by a model trained on an existing dataset. Such model bias is an obstacle to making models robust enough for practical use. In this work, we measure gender bias in models trained on different datasets, while analyzing the effect of different pre-trained word embeddings and model architectures. We also experiment with three mitigation methods: (1) debiased word embeddings, (2) gender swap data augmentation, and (3) fine-tuning with a larger corpus. These methods can effectively reduce model bias by 90-98% and can be extended to correct model bias in other scenarios.
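As a rough illustration of the second mitigation method, gender swap data augmentation, the sketch below adds a gender-swapped copy of each training sentence with the same label. It is a minimal sketch under stated assumptions, not the authors' implementation: the word-pair list `GENDER_PAIRS`, the helper names `gender_swap` and `augment`, and the label strings are all hypothetical, and ambiguous words such as "her" (which can map to "him" or "his") are deliberately left out.

```python
import re

# Hypothetical, deliberately tiny word-pair list (an assumption for illustration).
# Ambiguous pronouns such as "her"/"his"/"him" are omitted on purpose.
GENDER_PAIRS = [
    ("he", "she"),
    ("man", "woman"),
    ("men", "women"),
    ("boy", "girl"),
    ("father", "mother"),
    ("husband", "wife"),
]

# Build a symmetric lookup so each word maps to its counterpart.
SWAP = {}
for a, b in GENDER_PAIRS:
    SWAP[a] = b
    SWAP[b] = a

WORD = re.compile(r"[A-Za-z]+")


def gender_swap(text):
    """Replace every listed gendered word with its counterpart."""
    def replace(match):
        word = match.group(0)
        swapped = SWAP.get(word.lower())
        if swapped is None:
            return word  # not a listed gendered word: keep unchanged
        # Preserve a leading capital letter; other casing is not preserved.
        return swapped.capitalize() if word[0].isupper() else swapped
    return WORD.sub(replace, text)


def augment(dataset):
    """Return the dataset plus a swapped copy of each example that changes.

    `dataset` is a list of (text, label) pairs; the label is copied as-is.
    """
    augmented = []
    for text, label in dataset:
        augmented.append((text, label))
        swapped = gender_swap(text)
        if swapped != text:
            augmented.append((swapped, label))
    return augmented


if __name__ == "__main__":
    data = [("You are a good woman", "not_abusive"), ("Nice weather today", "not_abusive")]
    for text, label in augment(data):
        print(label, "|", text)
```

The intent of this kind of augmentation is that each gendered identity word appears equally often under each label, so the classifier has less reason to treat the word itself as a signal.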

Topics

Machine Learning > Application Areas > Data Augmentation
Machine Learning > Application Areas > Fairness
Natural Language Processing > Applications > Text Classification
Machine Learning > Learning Types > Fairness
Artificial Intelligence > Core AI > Natural Language Processing

Keywords

data augmentation; abusive language detection; word embedding; model debiasing; gender bias
