SKAM at SemEval-2023 Task 10: Linguistic Feature Integration and Continuous Pretraining for Online Sexism Detection and Classification

Murali Manohar Kondragunta; Amber Chen; Karlo Slot; Sanne Weering; Tommaso Caselli

SemEval 2023

Abstract

Sexism has been prevalent online. In this paper, we explored the effect of explicit linguistic features and continuous pretraining on the performance of pretrained language models in sexism detection. Adding linguistic features did not improve the model's performance, whereas continuous pretraining slightly boosted performance in Task B, from a mean macro-F1 score of 0.6156 to 0.6246. The best mean macro-F1 score in Task A (0.8331) was achieved by a finetuned HateBERT model using regular pretraining. Overall, continuous pretraining proved beneficial only for more nuanced downstream tasks such as Task B.

Topics

Machine Learning > Learning Types > Self-Supervised Learning; Natural Language Processing > Applications > Text Classification; Machine Learning > Learning Types > Transfer Learning; Natural Language Processing > Applications > Sentiment Analysis; Machine Learning > Learning Types > Classification

Keywords

transfer learning; text classification; pretrained language model; linguistic feature; hate speech detection; sexism detection; continuous pretraining

Related papers

Coco at SemEval-2023 Task 10: Explainable Detection of Online Sexism 2023

ZBL2W at SemEval-2023 Task 9: A Multilingual Fine-tuning Model with Data Augmentation for Tweet Intimacy Analysis 2023

MLModeler5 at SemEval-2023 Task 3: Detecting the Category and the Framing Techniques in Online News in a Multi-lingual Setup 2023

OPI at SemEval-2023 Task 9: A Simple But Effective Approach to Multilingual Tweet Intimacy Analysis 2023

NLP-LISAC at SemEval-2023 Task 12: Sentiment Analysis for Tweets expressed in African languages via Transformer-based Models 2023