COLING 2020
KUISAIL at SemEval-2020 Task 12: BERT-CNN for Offensive Speech Identification in Social Media
Abstract
In this paper, we describe our approach to using pre-trained BERT models with Convolutional Neural Networks (CNNs) for sub-task A of the Multilingual Offensive Language Identification shared task (OffensEval 2020), which is part of SemEval 2020. We show that combining a CNN with BERT performs better than using BERT on its own, and we emphasize the importance of utilizing pre-trained language models for downstream tasks. Our system ranked 4th in Arabic with a macro-averaged F1 score of 0.897, 4th in Greek with a score of 0.843, and 3rd in Turkish with a score of 0.814. Additionally, we present ArabicBERT, a set of pre-trained transformer language models for Arabic that we share with the community.
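The rankings above are reported in macro-averaged F1, which averages the per-class F1 scores with equal weight so that the less frequent class counts as much as the frequent one. The following is a minimal sketch of that metric in plain Python, not the shared task's official scorer; the function name `macro_f1` and the toy labels `OFF` and `NOT` are assumptions made for illustration only.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (illustrative helper)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for gold, pred in zip(y_true, y_pred):
        if gold == pred:
            tp[gold] += 1   # correct prediction for this class
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[gold] += 1   # gold class gains a false negative

    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy data, invented for this example (not taken from the shared task).
gold = ["OFF", "NOT", "NOT", "NOT", "OFF", "NOT"]
pred = ["OFF", "NOT", "NOT", "OFF", "NOT", "NOT"]
print(macro_f1(gold, pred))  # per-class F1: OFF = 0.5, NOT = 0.75
```

In this toy case accuracy is 4/6, while the macro average is pulled down by the weaker `OFF` class, which is why the metric suits imbalanced offensive-language data.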
Authors
Topics
Artificial Intelligence > Learning Paradigms > Transfer Learning
Machine Learning > Core Methods > Classification
Deep Learning > Architectures > Transformers
Deep Learning > Architectures > Neural Networks
Deep Learning > Techniques > Pretraining
Machine Learning > Learning Types > Transfer Learning
Keywords
transfer learning
text classification
offensive language detection
convolutional neural network
language model
pre-trained language model
transformer language model
social media
bidirectional encoder representations from transformers
offensive speech detection
arabic language processing
transformer model
offensive speech identification