LLMs for Low Resource Languages in Multilingual, Multimodal and Dialectal Settings

Firoj Alam; Shammur Absar Chowdhury; Sabri Boughorbel; Maram Hasanain

2024 EACL EACL 2024

LLMs for Low Resource Languages in Multilingual, Multimodal and Dialectal Settings

Abstract

AbstractThe recent breakthroughs in Artificial Intelligence (AI) can be attributed to the remarkable performance of Large Language Models (LLMs) across a spectrum of research areas (e.g., machine translation, question-answering, automatic speech recognition, text-to-speech generation) and application domains (e.g., business, law, healthcare, education, and psychology). The success of these LLMs largely de- pends on specific training techniques, most notably instruction tuning, RLHF, and subsequent prompting to achieve the desired output. As the development of such LLMs continues to increase in both closed and open settings, evaluation has become crucial for understanding their generalization capabilities across different tasks, modalities, languages, and dialects. This evaluation process is tightly coupled with prompting, which plays a key role in obtain- ing better outputs. There has been attempts to evaluate such models focusing on diverse tasks, languages, and dialects, which suggests that the capabilities of LLMs are still limited to medium-to-low-resource languages due to the lack of representative datasets. The tutorial offers an overview of this emerging research area. We explore the capabilities of LLMs in terms of their performance, zero- and few-shot settings, fine-tuning, instructions tuning, and close vs. open models with a special emphasis on low-resource settings. In addition to LLMs for standard NLP tasks, we will focus on speech and multimodality.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Natural Language Processing

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Firoj Alam , Shammur Absar Chowdhury , Sabri Boughorbel , Maram Hasanain

Topics

Artificial Intelligence > Learning Paradigms > Few-Shot Learning Natural Language Processing > Resources & Methods > Large Language Models Natural Language Processing > Resources & Methods > Multilingual NLP

Keywords

zero-shot learning few-shot learning multilingual nlp instruction tuning low-resource language large language model

Download PDF

Related papers

A Dataset for Metaphor Detection in Early Medieval Hebrew Poetry 2024

PRILoRA: Pruned and Rank-Increasing Low-Rank Adaptation 2024

Overview of the Hate Speech Detection in Turkish and Arabic Tweets (HSD-2Lang) Shared Task at CASE 2024 2024

Evaluating In-Context Learning for Computational Literary Studies: A Case Study Based on the Automatic Recognition of Knowledge Transfer in German Drama 2024

Selam@DravidianLangTech 2024:Identifying Hate Speech and Offensive Language 2024