Inducing a Lexicon of Abusive Words – a Feature-Based Approach

Michael Wiegand; Josef Ruppenhofer; Anna Schmidt; Clayton Greenberg

NAACL 2018

Abstract

We address the detection of abusive words. The task is to identify such words among a set of negative polar expressions. We propose novel features employing information from both corpora and lexical resources. These features are calibrated on a small manually annotated base lexicon which we use to produce a large lexicon. We show that the word-level information we learn cannot be equally derived from a large dataset of annotated microposts. We demonstrate the effectiveness of our (domain-independent) lexicon in the cross-domain detection of abusive microposts.

Topics

Natural Language Processing > Applications > Text Classification
Natural Language Processing > Applications > Sentiment Analysis

Keywords

text classification, abusive language detection, feature-based approach, feature engineering, lexicon induction, cross-domain detection, polar expression
