KUISAIL at SemEval-2020 Task 12: BERT-CNN for Offensive Speech Identification in Social Media

Ali Safaya; Moutasem Abdullatif; Deniz Yuret

2020 COLING COLING 2020

KUISAIL at SemEval-2020 Task 12: BERT-CNN for Offensive Speech Identification in Social Media

Abstract

AbstractIn this paper, we describe our approach to utilize pre-trained BERT models with Convolutional Neural Networks for sub-task A of the Multilingual Offensive Language Identification shared task (OffensEval 2020), which is a part of the SemEval 2020. We show that combining CNN with BERT is better than using BERT on its own, and we emphasize the importance of utilizing pre-trained language models for downstream tasks. Our system, ranked 4th with macro averaged F1-Score of 0.897 in Arabic, 4th with score of 0.843 in Greek, and 3rd with score of 0.814 in Turkish. Additionally, we present ArabicBERT, a set of pre-trained transformer language models for Arabic that we share with the community.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Deep Learning and Machine Learning

🧭 Keyword Pioneer — offensive speech identification

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Ali Safaya , Moutasem Abdullatif , Deniz Yuret

Topics

Artificial Intelligence > Learning Paradigms > Transfer Learning Machine Learning > Core Methods > Classification Deep Learning > Architectures > Transformers Deep Learning > Architectures > Neural Networks Deep Learning > Techniques > Pretraining Machine Learning > Learning Types > Transfer Learning

Keywords

transfer learning text classification offensive language detection convolutional neural network language model pre-trained language model transformer language model social media bidirectional encoder representations from transformer offensive speech detection arabic language processing transformer model offensive speech identification

Download PDF

Related papers

Persuasiveness of News Editorials depending on Ideology and Personality 2020

A Graph Representation of Semi-structured Data for Web Question Answering 2020

Span-based Joint Entity and Relation Extraction with Attention-based Span-specific and Contextual Semantic Representations 2020

Hierarchical Chinese Legal event extraction via Pedal Attention Mechanism 2020

End-to-End Emotion-Cause Pair Extraction with Graph Convolutional Network 2020