Mawdoo3 AI at MADAR Shared Task: Arabic Fine-Grained Dialect Identification with Ensemble Learning

Ahmad Ragab; Haitham Seelawi; Mostafa Samir; Abdelrahman Mattar; Hesham Al-Bataineh; Mohammad Zaghloul; Ahmad Mustafa; Bashar Talafha; Abed Alhakim Freihat; Hussein Al-Natsheh

ACL 2019

Abstract

In this paper we discuss several models we used to classify 25 city-level Arabic dialects, in addition to Modern Standard Arabic (MSA), as part of the MADAR shared task (sub-task 1). We propose an ensemble of experimentally selected, best-performing classifiers trained on a varied set of features. On the DEV dataset, our system achieves a macro F1-score of 69.3%, with an improvement of 1.4% in accuracy over the baseline model. On the TEST dataset, our best submitted run ranked third out of 19 participating teams, only 0.12% macro F1-score behind the top-ranked system.
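The two ideas the abstract relies on, combining several classifiers and scoring with macro F1, can be illustrated with a minimal sketch. It is not the authors' system: the abstract does not name the member classifiers, the features, or the combination rule, so the plurality vote, the three hypothetical models, and the toy dialect labels below are all assumptions made for illustration.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one label per sentence by plurality vote.

    Assumption: a simple plurality vote; the paper's actual ensembling rule
    is not described in the abstract.
    """
    combined = []
    for votes in zip(*predictions_per_model):
        # On a tie, Counter.most_common keeps first-seen order,
        # so the earliest model in the list wins.
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


def macro_f1(gold, predicted):
    """Unweighted mean of per-class F1 over every label seen in gold or predicted."""
    labels = sorted(set(gold) | set(predicted))
    scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, predicted))
        fp = sum(g != label and p == label for g, p in zip(gold, predicted))
        fn = sum(g == label and p != label for g, p in zip(gold, predicted))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            scores.append(2 * precision * recall / (precision + recall))
        else:
            scores.append(0.0)
    return sum(scores) / len(scores)


# Toy data (hypothetical): four sentences, three illustrative models.
gold = ["CAI", "CAI", "MSA", "RAB"]
model_a = ["CAI", "MSA", "MSA", "RAB"]
model_b = ["CAI", "CAI", "MSA", "CAI"]
model_c = ["ALX", "CAI", "MSA", "RAB"]

ensemble = majority_vote([model_a, model_b, model_c])
print("model_a macro F1:", round(macro_f1(gold, model_a), 4))
print("ensemble macro F1:", round(macro_f1(gold, ensemble), 4))
```

Because macro F1 averages per-class scores without weighting by class size, every dialect counts equally regardless of how many sentences it has. In the toy data each model makes a different mistake, so the vote corrects all of them; real ensembles gain less when their members' errors are correlated.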

Topics

Machine Learning > Core Methods > Classification
Natural Language Processing > Applications > Text Classification
Machine Learning > Learning Types > Ensemble Learning
Machine Learning > Core Methods > Ensemble Methods

Keywords

ensemble learning, text classification, fine-grained classification, Arabic dialect, dialect identification, macro F1-score
