Deeper Attention to Abusive User Content Moderation

John Pavlopoulos; Prodromos Malakasiotis; Ion Androutsopoulos

EMNLP 2017

Abstract

Experimenting with a new dataset of 1.6M user comments from a news portal and an existing dataset of 115K Wikipedia talk page comments, we show that an RNN operating on word embeddings outperforms the previous state of the art in moderation, which used logistic regression or an MLP classifier with character or word n-grams. We also compare against a CNN operating on word embeddings, and a word-list baseline. A novel, deep, classification-specific attention mechanism improves the performance of the RNN further, and can also highlight suspicious words for free, without including highlighted words in the training data. We consider both fully automatic and semi-automatic moderation.
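As a rough illustration of the attention idea in the abstract, the sketch below scores each RNN hidden state with a small multi-layer network (the "deep" part), normalizes the scores with a softmax, and classifies the weighted sum of the states. The same weights can then be read off as per-word highlights. This is a minimal sketch and not the authors' implementation: the function names (`attention_pool`, `classify`), the layer sizes, the example tokens, and the random numbers standing in for trained parameters and RNN states are all assumptions made for illustration.

```python
import math
import random


def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def softmax(scores):
    """Numerically stable softmax over a list of scalars."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, layers):
    """Score each hidden state with an MLP, then return the softmax
    weights and the attention-weighted sum of the states.

    layers is a list of (W, b) pairs; the last layer must output one score.
    """
    scores = []
    for h in hidden_states:
        a = h
        for i, (W, b) in enumerate(layers):
            a = [x + y for x, y in zip(matvec(W, a), b)]
            if i < len(layers) - 1:          # ReLU on all but the last layer
                a = [max(0.0, x) for x in a]
        scores.append(a[0])
    weights = softmax(scores)
    dim = len(hidden_states[0])
    pooled = [sum(w * h[j] for w, h in zip(weights, hidden_states))
              for j in range(dim)]
    return weights, pooled


def classify(pooled, w_out, b_out):
    """Logistic output layer: probability that the comment should be rejected."""
    z = sum(w * x for w, x in zip(w_out, pooled)) + b_out
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical inputs: random vectors stand in for RNN states and trained weights.
random.seed(0)
tokens = ["this", "comment", "is", "awful"]
dim, hid = 4, 3
hidden_states = [[random.uniform(-1, 1) for _ in range(dim)] for _ in tokens]
layers = [
    ([[random.uniform(-1, 1) for _ in range(dim)] for _ in range(hid)], [0.0] * hid),
    ([[random.uniform(-1, 1) for _ in range(hid)]], [0.0]),
]
w_out = [random.uniform(-1, 1) for _ in range(dim)]

weights, pooled = attention_pool(hidden_states, layers)
prob = classify(pooled, w_out, 0.0)

for tok, w in zip(tokens, weights):
    print(f"{tok:10s} attention={w:.3f}")
print(f"reject probability={prob:.3f}")
```

Because the attention weights are trained only through the classification loss, no word-level annotations are needed to obtain the highlights, which is the sense in which they come "for free".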

Topics

Artificial Intelligence > Core AI > Interpretability
Machine Learning > Core Methods > Classification
Natural Language Processing > Applications > Text Classification
Natural Language Processing > Applications > Sentiment Analysis
Machine Learning > Learning Types > Classification
Deep Learning > Learning Types > Deep Learning
Deep Learning > Techniques > Attention

Keywords

text classification; attention mechanism; content moderation; recurrent neural network; word embedding; abusive content detection; abusive content moderation; deep attention mechanism; suspicious word highlighting; semi-automatic moderation
