Reducing Unintended Identity Bias in Russian Hate Speech Detection

Nadezhda Zueva; Madina Kabirova; Pavel Kalaidin

EMNLP 2020


Abstract

Toxicity has become a serious problem for many online communities and is growing across many languages, including Russian. Hate speech creates an environment of intimidation and discrimination, and may even incite real-world violence. Researchers and social platforms have long worked on models that detect toxicity in online communication. A common problem with these models is bias towards certain words (e.g. woman, black, jew, or женщина, черный, еврей) that are not toxic in themselves but act as triggers for the classifier. In this paper, we describe our efforts to classify hate speech in Russian and propose simple techniques for reducing unintended bias: generating training data with language models that use terms and words related to protected identities as context, and applying word dropout to such words.
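The word-dropout idea named in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `identity_word_dropout`, the term list `IDENTITY_TERMS` (seeded only with the example words from the abstract), the dropout probability `p`, and the `<unk>` placeholder are all hypothetical choices.

```python
import random

# Hypothetical term list, seeded with the example words from the abstract.
# A real system would need a curated lexicon; this one is an assumption.
IDENTITY_TERMS = {"woman", "black", "jew", "женщина", "черный", "еврей"}


def identity_word_dropout(tokens, p=0.5, placeholder="<unk>", rng=None):
    """Replace each identity term with a placeholder with probability p.

    Tokens that are not in IDENTITY_TERMS are always kept unchanged.
    """
    rng = rng or random.Random()
    out = []
    for tok in tokens:
        # Compare case-insensitively so "Woman" and "woman" are treated alike.
        if tok.lower() in IDENTITY_TERMS and rng.random() < p:
            out.append(placeholder)
        else:
            out.append(tok)
    return out
```

Applied to training examples only, a transformation like this would make the classifier see identity terms less reliably, so it may lean more on the surrounding context than on the terms themselves. The value of `p` is a tunable assumption, not a setting reported in the text above.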


Authors

Nadezhda Zueva; Madina Kabirova; Pavel Kalaidin

Topics

Machine Learning > Core Methods > Classification
Machine Learning > Application Areas > Fairness
Natural Language Processing > Applications > Text Classification
Machine Learning > Learning Types > Deep Learning
Machine Learning > Learning Types > Fairness

Keywords

text classification; data augmentation; toxicity detection; language model; bias reduction; hate speech detection; Russian language; identity bias

