Detecting Unintended Social Bias in Toxic Language Datasets

Nihar Sahoo; Himanshu Gupta; Pushpak Bhattacharyya

CoNLL 2022

Abstract

With the rise of online hate speech, automatic detection of hate speech and offensive text has become a popular natural language processing task. However, very little research has addressed the detection of unintended social bias in these toxic language datasets. This paper introduces a new dataset, ToxicBias, curated from the dataset of the Kaggle competition “Jigsaw Unintended Bias in Toxicity Classification”. We aim to detect social biases, their categories, and the groups they target. The dataset contains instances annotated for five bias categories: gender, race/ethnicity, religion, political, and LGBTQ. We train transformer-based models on our curated dataset and report baseline performance for bias identification, target generation, and bias implications. Model biases and their mitigation are also discussed in detail. Our study motivates the systematic extraction of social bias data from toxic language datasets.

Topics

Machine Learning > Core Methods > Classification
Machine Learning > Application Areas > Fairness
Deep Learning > Architectures > Transformers

Keywords

bias mitigation, toxicity classification, social bias, hate speech detection, transformer model, target generation
