Arabic Train at NADI 2024 shared task: LLMs’ Ability to Translate Arabic Dialects into Modern Standard Arabic

Anastasiia Demidova; Hanin Atwany; Nour Rabih; Sanad Sha’ban

ACL 2024

Abstract

Machine translation (MT) between Arabic dialects and Modern Standard Arabic (MSA) is difficult because of the nuanced differences between them. In this study, we address Subtask 3 of the NADI shared task (CITATION), which focuses on translating sentences from four distinct Arabic dialects into MSA. We evaluate several models on this dialect-to-MSA task: Jais, NLLB, GPT-3.5, and GPT-4. Jais outperforms all other models, with an average BLEU score of 19.48 in a combined zero- and few-shot setting, whereas NLLB performs worst, with a BLEU score of 8.77.
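As a rough illustration of the zero- and few-shot settings mentioned in the abstract, the sketch below assembles a translation prompt with or without in-context examples. It is not the authors' prompt: the function name `build_prompt`, the instruction wording, the `Dialect:`/`MSA:` labels, and the placeholder strings are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def build_prompt(
    source: str,
    dialect: str,
    examples: Sequence[Tuple[str, str]] = (),
) -> str:
    """Assemble a dialect-to-MSA translation prompt (hypothetical format).

    With no examples this is a zero-shot prompt; each (dialect, MSA) pair
    in `examples` adds one in-context demonstration (few-shot).
    """
    lines = [
        f"Translate the following {dialect} Arabic sentence "
        "into Modern Standard Arabic."
    ]
    # Each demonstration shows a dialect sentence followed by its MSA version.
    for dialect_text, msa_text in examples:
        lines.append(f"Dialect: {dialect_text}")
        lines.append(f"MSA: {msa_text}")
    # The sentence to translate comes last, with the MSA slot left open.
    lines.append(f"Dialect: {source}")
    lines.append("MSA:")
    return "\n".join(lines)


# Placeholder strings stand in for real sentences; the dialect label is illustrative.
zero_shot = build_prompt("<dialect sentence>", "Egyptian")
few_shot = build_prompt(
    "<dialect sentence>",
    "Egyptian",
    examples=[("<example dialect 1>", "<example MSA 1>"),
              ("<example dialect 2>", "<example MSA 2>")],
)
```

The resulting string would be sent to whichever model is under evaluation. The only difference between the two settings in this sketch is whether `examples` is empty.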

Topics

Natural Language Processing > Applications > Machine Translation
Natural Language Processing > Resources & Methods > Large Language Models

Keywords

zero-shot learning, few-shot learning, machine translation, Arabic dialect, large language model, Modern Standard Arabic
