1-800-SHARED-TASKS@NLU of Devanagari Script Languages 2025: Detection of Language, Hate Speech, and Targets using LLMs

Jebish Purbey; Siddartha Pullakhandam; Kanwal Mehreen; Muhammad Arham; Drishti Sharma; Ashay Srivastava; Ram Mohan Rao Kadiyala

2025 COLING COLING 2025

1-800-SHARED-TASKS@NLU of Devanagari Script Languages 2025: Detection of Language, Hate Speech, and Targets using LLMs

Abstract

AbstractThis paper presents a detailed system description of our entry for the CHiPSAL 2025 challenge, focusing on language detection, hate speech identification, and target detection in Devanagari script languages. We experimented with a combination of large language models and their ensembles, including MuRIL, IndicBERT, and Gemma-2, and leveraged unique techniques like focal loss to address challenges in the natural understanding of Devanagari languages, such as multilingual processing and class imbalance. Our approach achieved competitive results across all tasks: F1 of 0.9980, 0.7652, and 0.6804 for Sub-tasks A, B, and C respectively. This work provides insights into the effectiveness of transformer models in tasks with domain-specific and linguistic challenges, as well as areas for potential improvement in future iterations.

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Jebish Purbey , Siddartha Pullakhandam , Kanwal Mehreen , Muhammad Arham , Drishti Sharma , Ashay Srivastava , Ram Mohan Rao Kadiyala

Topics

Natural Language Processing > Applications > Text Classification Natural Language Processing > Resources & Methods > Large Language Models Natural Language Processing > Applications > Sentiment Analysis Natural Language Processing > Applications > Named Entity Recognition

Keywords

focal loss hate speech detection devanagari script language detection target detection large language model transformer model

Download PDF

Related papers

Navigating Dialectal Bias and Ethical Complexities in Levantine Arabic Hate Speech Detection 2025

TaCIE: Enhancing Instruction Comprehension in Large Language Models through Task-Centred Instruction Evolution 2025

Positive Text Reframing under Multi-strategy Optimization 2025

RAM2C: A Liberal Arts Educational Chatbot based on Retrieval-augmented Multi-role Multi-expert Collaboration 2025

Two-stage Incomplete Utterance Rewriting on Editing Operation 2025