NTA at SemEval-2025 Task 11: Enhanced Multilingual Textual Multi-label Emotion Detection via Integrated Augmentation Learning

Nguyen Pham Hoang Le; An Nguyen Tran Khuong; Tram Nguyen Thi Ngoc; Thin Dang Van

SemEval 2025

Abstract

Emotion detection in text is crucial for various applications, but progress, especially in multi-label settings, is often hampered by data scarcity, particularly for low-resource languages such as Emakhuwa and Tigrinya. This lack of data limits model performance and generalizability. To address this, the NTA team developed a system for SemEval-2025 Task 11 that applies five data augmentation techniques (swap, deletion, oversampling, emotion-focused synonym insertion, and synonym replacement) to enhance baseline models for multilingual textual multi-label emotion detection. Our system achieved significantly higher macro F1-scores than the baseline across multiple languages, demonstrating a robust approach to data scarcity. It ranked 17th overall on the private leaderboard and achieved the highest score for Tigrinya, showing the effectiveness of the approach in a low-resource setting.
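
Two of the augmentation operations named above, swap and deletion, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the whitespace tokenization, and the parameters `n_swaps` and `p` are assumptions made for illustration only. In a multi-label setting, the emotion labels of the original sentence would simply be copied to each augmented variant.

```python
import random


def random_swap(tokens, n_swaps, rng):
    """Return a copy of tokens with n_swaps random pairs of positions exchanged."""
    tokens = list(tokens)  # copy so the caller's list is not modified
    if len(tokens) < 2:
        return tokens
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(tokens)), 2)  # two distinct positions
        tokens[i], tokens[j] = tokens[j], tokens[i]
    return tokens


def random_deletion(tokens, p, rng):
    """Drop each token independently with probability p, keeping at least one."""
    if not tokens:
        return []
    kept = [t for t in tokens if rng.random() >= p]
    # Never return an empty sentence: fall back to a single random token.
    return kept if kept else [rng.choice(tokens)]


def augment(text, labels, rng, n_swaps=1, p=0.1):
    """Produce two augmented (text, labels) pairs; labels are left unchanged."""
    tokens = text.split()  # assumption: simple whitespace tokenization
    swapped = " ".join(random_swap(tokens, n_swaps, rng))
    deleted = " ".join(random_deletion(tokens, p, rng))
    return [(swapped, labels), (deleted, labels)]
```

Passing an explicit `random.Random(seed)` instance as `rng` keeps the augmented set reproducible across runs.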
