Debiasing Static Embeddings for Hate Speech Detection

Ling Sun; Soyoung Kim; Xiao Dong; Sandra Kübler

ACL 2025

Abstract

We examine how embedding bias affects hate speech detection by evaluating two debiasing methods: hard-debiasing and soft-debiasing. We analyze stereotype and sentiment associations within the embedding space and assess whether debiased models reduce censorship of marginalized authors while improving detection of hate speech targeting these groups. Our findings highlight how embedding bias propagates into downstream tasks and demonstrate how well different embedding bias metrics can predict bias in hate speech detection.
