2025
EMNLP
EMNLP 2025
Bias Analysis and Mitigation through Protected Attribute Detection and Regard Classification
Abstract
AbstractLarge language models (LLMs) acquire general linguistic knowledge from massive-scale pretraining. However, pretraining data mainly comprised of web-crawled texts contain undesirable social biases which can be perpetuated or even amplified by LLMs. In this study, we propose an efficient yet effective annotation pipeline to investigate social biases in the pretraining corpora. Our pipeline consists of protected attribute detection to identify diverse demographics, followed by regard classification to analyze the language polarity towards each attribute. Through our experiments, we demonstrate the effect of our bias analysis and mitigation measures, focusing on Common Crawl as the most representative pretraining corpus.
🌉
Interdisciplinary Bridge
— Artificial Intelligence and Machine Learning and Natural Language Processing
🧭
Keyword Pioneer
— protected attribute detection
🐝
Cross-Pollinator
— Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio
Authors
Topics
Artificial Intelligence > Core AI > Responsible AI
Machine Learning > Application Areas > Fairness
Natural Language Processing > Resources & Methods > Large Language Models
Artificial Intelligence > Core AI > Large Language Models
Artificial Intelligence > Core AI > Fairness
Machine Learning > Learning Types > Fairness