Mitigating Biases in Hate Speech Detection from A Causal Perspective

Zhehao Zhang; Jiaao Chen; Diyi Yang

EMNLP 2023

Abstract

Nowadays, many hate speech detectors are built to automatically detect hateful content. However, their training sets are sometimes skewed towards certain stereotypes (e.g., race or religion-related). As a result, the detectors are prone to depend on some shortcuts for predictions. Previous works mainly focus on token-level analysis and heavily rely on human experts’ annotations to identify spurious correlations, which is not only costly but also incapable of discovering higher-level artifacts. In this work, we use grammar induction to find grammar patterns for hate speech and analyze this phenomenon from a causal perspective. Concretely, we categorize and verify different biases based on their spuriousness and influence on the model prediction. Then, we propose two mitigation approaches including Multi-Task Intervention and Data-Specific Intervention based on these confounders. Experiments conducted on 9 hate speech datasets demonstrate the effectiveness of our approaches.

Topics

Artificial Intelligence > Core AI > Causal Inference
Machine Learning > Application Areas > Fairness
Natural Language Processing > Applications > Text Classification
Knowledge & Reasoning > Reasoning > Causal Inference
Machine Learning > Learning Types > Multi-Task Learning
Artificial Intelligence > Core AI > Fairness
Machine Learning > Learning Types > Causal Inference
Artificial Intelligence > Core AI > Natural Language Processing

Keywords

causal inference, grammar induction, bias mitigation, spurious correlation, hate speech detection, causal perspective, multi-task intervention
