CSECU-DSG at SemEval-2021 Task 5: Leveraging Ensemble of Sequence Tagging Models for Toxic Spans Detection

Tashin Hossain; Jannatun Naim; Fareen Tasneem; Radiathun Tasnia; Abu Nowshed Chy

ACL 2021

Abstract

The upsurge of prolific blogging and microblogging platforms has enabled abusers to spread negativity and threats more widely than ever. Detecting the toxic portions of a text substantially helps to moderate or remove the abusive parts and so maintain sound online platforms. This paper describes our participation in the SemEval-2021 toxic spans detection task. The task requires detecting the spans that convey toxic remarks in a given text. We explore an ensemble of sequence labeling models, including a BiLSTM-CRF, a spaCy NER model with custom toxic tags, and a fine-tuned BERT model, to identify the toxic spans. Finally, a majority voting ensemble method is used to determine the unified toxic spans. Experimental results show that our model performs competitively among the participants.
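The majority voting step can be illustrated with a minimal sketch. It assumes that each model's predictions have already been converted to a set of toxic character offsets and that an offset is kept when more than half of the models predict it. The abstract does not state the voting granularity or threshold, so both are assumptions here, and the function name `majority_vote_spans` and the example offsets are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Optional, Set


def majority_vote_spans(
    predictions: Iterable[Set[int]],
    min_votes: Optional[int] = None,
) -> List[int]:
    """Return the sorted character offsets predicted by at least `min_votes` models.

    Hypothetical helper: voting per character offset is an assumption,
    not a detail confirmed by the abstract.
    """
    model_outputs = [set(p) for p in predictions]
    if not model_outputs:
        return []
    if min_votes is None:
        # Strict majority: 2 of 3 models, 3 of 5 models, and so on.
        min_votes = len(model_outputs) // 2 + 1
    # Count how many models marked each offset as toxic.
    votes = Counter(offset for output in model_outputs for offset in output)
    return sorted(offset for offset, n in votes.items() if n >= min_votes)


# Made-up offsets standing in for the three taggers' outputs.
bilstm_crf_offsets = {10, 11, 12, 13}
spacy_ner_offsets = {10, 11, 12, 13, 14}
bert_offsets = {11, 12, 13, 20}

unified = majority_vote_spans(
    [bilstm_crf_offsets, spacy_ner_offsets, bert_offsets]
)
print(unified)  # [10, 11, 12, 13]
```

With three models, an offset needs two votes to survive, so offsets 14 and 20 (one vote each) are dropped while 10 through 13 are kept.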


Authors

Tashin Hossain, Jannatun Naim, Fareen Tasneem, Radiathun Tasnia, Abu Nowshed Chy

Topics

Natural Language Processing > Understanding > Named Entity Recognition
Machine Learning > Learning Types > Ensemble Learning
Machine Learning > Core Methods > Ensemble Methods
Machine Learning > Core Methods > Sequence Labeling
Deep Learning > Learning Types > Ensemble Learning
Deep Learning > Learning Types > Sequence Labeling

Keywords

ensemble learning, sequence labeling, named entity recognition, majority voting, pretrained language model, conditional random field, BERT fine-tuning, bidirectional long short-term memory, toxic span detection, bidirectional LSTM-CRF
