Why Swear? Analyzing and Inferring the Intentions of Vulgar Expressions

Eric Holgate; Isabel Cachola; Daniel Preoţiuc-Pietro; Junyi Jessy Li

EMNLP 2018

Abstract

Vulgar words are employed in language use for several different functions, ranging from expressing aggression to signaling group identity or the informality of the communication. This versatility of usage of a restricted set of words is challenging for downstream applications and has yet to be studied quantitatively or using natural language processing techniques. We introduce a novel data set of 7,800 tweets from users with known demographic traits where all instances of vulgar words are annotated with one of the six categories of vulgar word use. Using this data set, we present the first analysis of the pragmatic aspects of vulgarity and how they relate to social factors. We build a model able to predict the category of a vulgar word based on the immediate context it appears in with 67.4 macro F1 across six classes. Finally, we demonstrate the utility of modeling the type of vulgar word use in context by using this information to achieve state-of-the-art performance in hate speech detection on a benchmark data set.
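For readers unfamiliar with the metric quoted in the abstract, macro F1 is the unweighted mean of the per-class F1 scores, so rare categories count as much as frequent ones. The sketch below is illustrative only and is not the authors' evaluation code: the function name `macro_f1` and the placeholder labels `"a"`, `"b"`, `"c"` are assumptions, and the result is on a 0 to 1 scale (the abstract reports the score on a 0 to 100 scale).

```python
from collections import Counter


def macro_f1(gold, predicted, labels):
    """Unweighted mean of per-class F1 over `labels` (hypothetical helper)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1   # correct prediction for class g
        else:
            fp[p] += 1   # p was predicted but is wrong
            fn[g] += 1   # g was the true class but was missed
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(labels)


# Placeholder labels; the paper's six categories are not named in this section.
labels = ["a", "b", "c"]
gold = ["a", "a", "b", "b", "c", "c"]
pred = ["a", "b", "b", "b", "c", "a"]
score = macro_f1(gold, pred, labels)
```

In this toy example the per-class F1 scores are 0.5, 0.8, and 2/3, so the macro average is their plain mean regardless of how many instances each class has.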

Topics

Natural Language Processing > Understanding > Sentiment Analysis
Natural Language Processing > Applications > Text Classification

Keywords

sentiment analysis, text classification, social media, hate speech detection, vulgar word analysis
