Hate-Alert@DravidianLangTech-EACL2021: Ensembling strategies for Transformer-based Offensive language Detection

Debjoy Saha; Naman Paharia; Debajit Chakraborty; Punyajoy Saha; Animesh Mukherjee

2021 EACL EACL 2021

Hate-Alert@DravidianLangTech-EACL2021: Ensembling strategies for Transformer-based Offensive language Detection

Abstract

AbstractSocial media often acts as breeding grounds for different forms of offensive content. For low resource languages like Tamil, the situation is more complex due to the poor performance of multilingual or language-specific models and lack of proper benchmark datasets. Based on this shared task “Offensive Language Identification in Dravidian Languages” at EACL 2021; we present an exhaustive exploration of different transformer models, We also provide a genetic algorithm technique for ensembling different models. Our ensembled models trained separately for each language secured the first position in Tamil, the second position in Kannada, and the first position in Malayalam sub-tasks. The models and codes are provided.

🌉 Interdisciplinary Bridge — Machine Learning and Natural Language Processing

🐣 Hot Topic Early Bird — genetic algorithm

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Debjoy Saha , Naman Paharia , Debajit Chakraborty , Punyajoy Saha , Animesh Mukherjee

Topics

Machine Learning > Optimization & Theory > Optimization Natural Language Processing > Applications > Text Classification

Keywords

ensemble learning text classification offensive language detection model ensemble genetic algorithm transformer model

Download PDF

Related papers

Joint Coreference Resolution and Character Linking for Multiparty Conversation 2021

Progressively Pretrained Dense Corpus Index for Open-Domain Question Answering 2021

Crisscrossed Captions: Extended Intramodal and Intermodal Semantic Similarity Judgments for MS-COCO 2021

Representations for Question Answering from Documents with Tables and Text 2021

Gender and Racial Fairness in Depression Research using Social Media 2021