iCompass at SemEval-2020 Task 12: From a Syntax-ignorant N-gram Embeddings Model to a Deep Bidirectional Language Model

Abir Messaoudi; Hatem Haddad; Moez Ben Haj Hmida

2020 SEMEVAL SemEval 2020

iCompass at SemEval-2020 Task 12: From a Syntax-ignorant N-gram Embeddings Model to a Deep Bidirectional Language Model

Abstract

AbstractWe describe our submitted system to the SemEval 2020. We tackled Task 12 entitled “Multilingual Offensive Language Identification in Social Media”, specifically subtask 4A-Arabic. We propose three Arabic offensive language identification models: Tw-StAR, BERT and BERT+BiLSTM. Two Arabic abusive/hate datasets were added to the training dataset: L-HSAB and T-HSAB. The final submission was chosen based on the best performances which was achieved by the BERT+BiLSTM model.

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Machine Learning, Mathematics & Optimization, Natural Language Processing, Speech & Audio