NAACL 2025

codecrackers@DravidianLangTech 2025: Sentiment Classification in Tamil and Tulu Code-Mixed Social Media Text Using Machine Learning

Abstract

Sentiment analysis of code-mixed Dravidian languages has become an important research area as the volume of multilingual and code-mixed content on social media grows. It has nevertheless received little attention, owing to challenges such as class imbalance, small sample sizes, and the informal nature of code-mixed text. This paper presents our submission to the “Seventh Shared Task on Sentiment Analysis in Code-mixed Tamil and Tulu”, held as part of DravidianLangTech (NAACL 2025). We applied an SVM-based approach to sentiment classification for both Tamil and Tulu. The SVM model achieved macro-average F1 scores of 0.54 for Tulu and 0.438 for Tamil, suggesting that traditional machine learning methods remain a viable option for sentiment classification of code-mixed text in low-resource settings.
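Because the abstract reports macro-average F1 and names class imbalance as a challenge, a small sketch of how that metric is computed may help. The function name `macro_f1`, the label names, and the toy data below are illustrative assumptions, not taken from the shared task or from the authors' code.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every class seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed
    scores = []
    for c in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical, imbalanced toy labels (not real task data):
# a classifier that always predicts the majority class.
y_true = ["Positive"] * 8 + ["Negative"] * 2
y_pred = ["Positive"] * 10

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"accuracy={accuracy:.3f}  macro-F1={macro_f1(y_true, y_pred):.3f}")
```

In this toy case the majority-class predictor looks strong on accuracy but scores far lower on macro-F1, since each class contributes equally to the average regardless of how many samples it has. That is why macro-F1 is a common choice for imbalanced sentiment data.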
