KEIS@JUST at SemEval-2020 Task 12: Identifying Multilingual Offensive Tweets Using Weighted Ensemble and Fine-Tuned BERT

Saja Tawalbeh; Mahmoud Hammad; Mohammad AL-Smadi

2020 COLING COLING 2020

KEIS@JUST at SemEval-2020 Task 12: Identifying Multilingual Offensive Tweets Using Weighted Ensemble and Fine-Tuned BERT

Abstract

AbstractThis research presents our team KEIS@JUST participation at SemEval-2020 Task 12 which represents shared task on multilingual offensive language. We participated in all the provided languages for all subtasks except sub-task-A for the English language. Two main approaches have been developed the first is performed to tackle both languages Arabic and English, a weighted ensemble consists of Bi-GRU and CNN followed by Gaussian noise and global pooling layer multiplied by weights to improve the overall performance. The second is performed for other languages, a transfer learning from BERT beside the recurrent neural networks such as Bi-LSTM and Bi-GRU followed by a global average pooling layer. Word embedding and contextual embedding have been used as features, moreover, data augmentation has been used only for the Arabic language.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Deep Learning and Machine Learning

🧭 Keyword Pioneer — multilingual offensive language

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Saja Tawalbeh , Mahmoud Hammad , Mohammad AL-Smadi

Topics

Artificial Intelligence > Learning Paradigms > Transfer Learning Machine Learning > Core Methods > Classification Machine Learning > Application Areas > Data Augmentation Deep Learning > Architectures > Transformers Deep Learning > Architectures > Neural Networks Machine Learning > Learning Types > Transfer Learning

Keywords

transfer learning text classification data augmentation offensive language detection convolutional neural network bidirectional lstm gated recurrent unit weighted ensemble bidirectional gru bidirectional encoder representations from transformer offensive language bidirectional gated recurrent unit multilingual offensive language offensive tweet detection

Download PDF

Related papers

Persuasiveness of News Editorials depending on Ideology and Personality 2020

A Graph Representation of Semi-structured Data for Web Question Answering 2020

Span-based Joint Entity and Relation Extraction with Attention-based Span-specific and Contextual Semantic Representations 2020

Hierarchical Chinese Legal event extraction via Pedal Attention Mechanism 2020

End-to-End Emotion-Cause Pair Extraction with Graph Convolutional Network 2020