DialectNLU at NADI 2023 Shared Task: Transformer Based Multitask Approach Jointly Integrating Dialect and Machine Translation Tasks in Arabic

Hariram Veeramani; Surendrabikram Thapa; Usman Naseem

EMNLP 2023

Abstract

With approximately 400 million speakers worldwide, Arabic ranks as the fifth most-spoken language globally, which motivates continued advances in natural language processing for the language. This paper presents a system description of the approaches employed for the subtasks of the Nuanced Arabic Dialect Identification (NADI) shared task at EMNLP 2023. For the first subtask, closed country-level dialect identification, we employ an ensemble of two Arabic language models. For the second subtask, closed dialect to Modern Standard Arabic (MSA) machine translation, our approach combines sequence-to-sequence models, all trained on an Arabic-specific dataset. Our team ranks 10th on subtask 1 and 3rd on subtask 2.
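The abstract does not say how the two Arabic language models are combined for subtask 1. As a minimal sketch of one common option, the snippet below averages the per-country probabilities of two classifiers (soft voting) and picks the highest-scoring label. The function name `soft_vote`, the example labels, and the probability values are hypothetical and only illustrate the idea; they are not taken from the paper.

```python
from typing import Dict


def soft_vote(prob_a: Dict[str, float], prob_b: Dict[str, float]) -> str:
    """Average two per-label probability distributions and return the top label.

    Assumption: both models emit probabilities over country labels.
    A label missing from one model is treated as probability 0.0.
    """
    labels = set(prob_a) | set(prob_b)
    averaged = {
        label: (prob_a.get(label, 0.0) + prob_b.get(label, 0.0)) / 2
        for label in labels
    }
    # Iterate over sorted labels so ties are broken alphabetically.
    return max(sorted(averaged), key=averaged.get)


# Hypothetical outputs of two dialect classifiers for one input sentence.
model_a = {"Egypt": 0.6, "Sudan": 0.3, "Iraq": 0.1}
model_b = {"Egypt": 0.2, "Sudan": 0.7, "Iraq": 0.1}
prediction = soft_vote(model_a, model_b)
```

Averaging probabilities lets a confident model outweigh an uncertain one, which a plain majority vote cannot do with only two members. Weighted averaging or a learned combiner would be equally plausible alternatives.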

Topics

Machine Learning > Core Methods > Classification
Natural Language Processing > Applications > Machine Translation
Natural Language Processing > Applications > Text Classification
Natural Language Processing > Resources & Methods > Multilingual NLP
Machine Learning > Learning Types > Multi-Task Learning
Deep Learning > Learning Types > Multi-Task Learning
Artificial Intelligence > Core AI > Natural Language Processing

Keywords

multi-task learning, ensemble learning, machine translation, multitask learning, arabic language model, sequence-to-sequence model, dialect identification, transformer model, multitask approach, transformer based
