ACL 2025
Disentangling Biased Representations: A Causal Intervention Framework for Fairer NLP Models
Abstract
Natural language processing (NLP) systems often inadvertently encode and amplify social biases through entangled representations of demographic attributes and task-related attributes. To mitigate this, we propose a novel framework that combines causal analysis with practical intervention strategies. The method leverages attribute-specific prompting to isolate sensitive attributes while applying information-theoretic constraints to minimize spurious correlations. Experiments across six language models and two classification tasks demonstrate its effectiveness. We hope this work will provide the NLP community with a causal disentanglement perspective for achieving fairness in NLP systems.
Topics
Artificial Intelligence > Core AI > Causal Inference
Machine Learning > Core Methods > Representation Learning
Machine Learning > Application Areas > Fairness
Knowledge & Reasoning > Reasoning > Causal Inference
Machine Learning > Learning Types > Representation Learning
Artificial Intelligence > Core AI > Fairness
Machine Learning > Learning Types > Fairness