Is a bunch of words enough to detect disagreement in hateful content?

Giulia Rizzi; Paolo Rosso; Elisabetta Fersini

COLING 2025

Abstract

The complexity of the annotation process when adopting crowdsourcing platforms for labeling hateful content can be linked to the presence of textual constituents that are ambiguous, easily misinterpreted, or characterized by reduced surrounding context. In this paper, we address the problem of perspectivism in hateful speech by leveraging contextualized embedding representations of its constituents and weighted probability functions. The effectiveness of the proposed approach is assessed using four datasets provided for the SemEval 2023 Task 11 shared task. The results show that a few elements can serve as a proxy to identify sentences that may be perceived differently by multiple readers, without necessarily requiring complex Large Language Models.

Topics

Machine Learning > Core Methods > Classification
Natural Language Processing > Understanding > Sentiment Analysis
Natural Language Processing > Applications > Sentiment Analysis
Artificial Intelligence > Core AI > Fairness
Machine Learning > Learning Types > Multi-Label Classification

Keywords

contextualized embedding, hateful content detection, disagreement detection, weighted probability
