WikiBias as an Extrapolation Corpus for Bias Detection

K. Salas-Jimenez; Francisco Fernando Lopez-Ponce; Sergio-Luis Ojeda-Trueba; Gemma Bel-Enguix

EMNLP 2024

Abstract

This paper explores whether a machine learning model trained on Wikipedia data can detect subjectivity in sentences and generalize effectively to other domains. To this end, we ran experiments with the WikiBias corpus, the BABE corpus, and the CheckThat! dataset. We tested several classical machine learning models, including Logistic Regression, SVC, and SVR, using features such as Sentence Transformers similarity, probabilistic sentiment measures, and biased lexicons. Pre-trained models such as DistilRoBERTa and large language models such as Gemma and GPT-4 were also tested on the same classification task.
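To illustrate the cross-domain setup the abstract describes (fit a classifier on one corpus, then score it on a different one), here is a minimal sketch. It is not the paper's pipeline: the lexicon, the toy sentences, and the helper names (`featurize`, `train_logreg`, `predict`, `accuracy`) are all hypothetical, and a small pure-Python logistic regression stands in for a library implementation.

```python
import math

# Hypothetical toy lexicon of subjective cue words (an assumption, not from the paper).
BIAS_LEXICON = {"clearly", "terrible", "amazing", "obviously", "disgraceful", "brilliant"}

# Hypothetical toy data: label 1 = subjective/biased, 0 = neutral.
SOURCE = [  # stands in for a Wikipedia-style training corpus
    ("The city was founded in 1850.", 0),
    ("The river flows through three provinces.", 0),
    ("The film was clearly a terrible failure.", 1),
    ("He is obviously the most brilliant composer.", 1),
]
TARGET = [  # stands in for an out-of-domain (e.g. news) test corpus
    ("The minister announced the budget on Monday.", 0),
    ("The amazing new policy is clearly working.", 1),
]

def featurize(sentence):
    """Return [share of lexicon words, 1.0 if any lexicon word occurs else 0.0]."""
    tokens = [t.strip(".,!?").lower() for t in sentence.split()]
    hits = sum(t in BIAS_LEXICON for t in tokens)
    return [hits / max(len(tokens), 1), 1.0 if hits else 0.0]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logreg(X, y, lr=0.5, epochs=500):
    """Fit logistic regression by batch gradient descent; returns (weights, bias)."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            grad_w = [g + err * xj for g, xj in zip(grad_w, xi)]
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b

def predict(model, sentence):
    """Classify one sentence with a 0.5 probability threshold."""
    w, b = model
    x = featurize(sentence)
    return int(sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5)

def accuracy(model, data):
    """Fraction of (sentence, label) pairs classified correctly."""
    return sum(predict(model, s) == y for s, y in data) / len(data)

# Train on the source domain only, then evaluate on both domains.
model = train_logreg([featurize(s) for s, _ in SOURCE], [y for _, y in SOURCE])
print("in-domain accuracy:", accuracy(model, SOURCE))
print("out-of-domain accuracy:", accuracy(model, TARGET))
```

The point of the sketch is the evaluation split: the target sentences never touch training, so the second score reflects how well the learned cues transfer. With real corpora, the gap between the two scores is the quantity of interest.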

Topics

Artificial Intelligence > Core AI > Responsible AI
Machine Learning > Core Methods > Classification
Natural Language Processing > Applications > Text Classification

Keywords

transfer learning, text classification, bias detection, sentence transformer, subjectivity detection
