Comparative Studies of Detecting Abusive Language on Twitter

Younghun Lee; Seunghyun Yoon; Kyomin Jung

EMNLP 2018

Abstract

The context-dependent nature of online aggression makes annotating large collections of data extremely difficult. Previously studied datasets in abusive language detection have been too small to train deep learning models effectively. Recently, Hate and Abusive Speech on Twitter, a dataset much greater in size and reliability, has been released. However, this dataset has not yet been studied comprehensively. In this paper, we conduct the first comparative study of various learning models on Hate and Abusive Speech on Twitter, and discuss the possibility of using additional features and context data for improvements. Experimental results show that bidirectional GRU networks trained on word-level features, with Latent Topic Clustering modules, are the most accurate models, with an F1 score of 0.805.
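The abstract reports a single F1 score but does not say how it is averaged across classes. As a minimal sketch, assuming a support-weighted average over per-class F1 (an assumption, not something the abstract confirms), the metric could be computed as below. The function names and the label strings are hypothetical and chosen only for illustration.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: F1} from parallel lists of gold and predicted labels."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for gold, pred in zip(y_true, y_pred):
        if gold == pred:
            tp[gold] += 1
        else:
            fp[pred] += 1  # predicted label gains a false positive
            fn[gold] += 1  # gold label gains a false negative
    scores = {}
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN); defined as 0.0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def weighted_f1(y_true, y_pred):
    """Average per-class F1, weighting each class by its gold-label count.

    Assumption: support-weighted averaging; the abstract does not specify it.
    """
    support = Counter(y_true)
    scores = per_class_f1(y_true, y_pred)
    return sum(scores[label] * support[label] for label in support) / len(y_true)


if __name__ == "__main__":
    # Toy labels, hypothetical and not taken from the dataset
    gold = ["normal", "normal", "abusive", "hateful"]
    pred = ["normal", "abusive", "abusive", "normal"]
    print(per_class_f1(gold, pred))
    print(weighted_f1(gold, pred))
```

With a weighted average, frequent classes dominate the score, so a model can score well overall while still performing poorly on a rare class; reporting the per-class values alongside the average makes that visible.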

Topics

Machine Learning > Core Methods > Classification
Machine Learning > Core Methods > Clustering
Deep Learning > Architectures > Transformers
Natural Language Processing > Applications > Sentiment Analysis

Keywords

text classification, abusive language detection, deep learning, word embedding, hate speech detection, bidirectional GRU, latent topic clustering
