Fair Text Classification via Transferable Representations

Thibaud Leteno; Michaël Perrot; Charlotte Laclau; Antoine Gourru; Christophe Gravier

JMLR 2025

Abstract

Group fairness is a central research topic in text classification, where reaching fair treatment between sensitive groups (e.g., women and men) remains an open challenge. We propose an approach that extends the use of the Wasserstein Dependency Measure for learning unbiased neural text classifiers. Given the challenge of distinguishing fair from unfair information in a text encoder, we draw inspiration from adversarial training by inducing independence between representations learned for the target label and those for a sensitive attribute. We further show that domain adaptation can be efficiently leveraged to remove the need for access to the sensitive attributes in the data set we cure. We provide both theoretical and empirical evidence that our approach is well-founded.
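
For readers unfamiliar with the Wasserstein Dependency Measure named in the abstract, its standard definition can be sketched as follows. The symbols are illustrative assumptions and do not come from the abstract: $Z_y$ stands for the representation learned for the target label, $Z_s$ for the representation learned for the sensitive attribute, and $f$ for a 1-Lipschitz critic function. The exact objective optimized in the paper may differ.

```latex
% Wasserstein Dependency Measure between two representations (standard definition).
% Z_y, Z_s, and f are illustrative names, not notation taken from the abstract.
\[
  I_W(Z_y; Z_s)
  \;=\; W_1\!\left(P_{Z_y Z_s},\; P_{Z_y} \otimes P_{Z_s}\right)
  \;=\; \sup_{\|f\|_{L} \le 1}
        \mathbb{E}_{(z_y, z_s) \sim P_{Z_y Z_s}}\!\left[f(z_y, z_s)\right]
        \;-\;
        \mathbb{E}_{z_y \sim P_{Z_y},\; z_s \sim P_{Z_s}}\!\left[f(z_y, z_s)\right]
\]
```

The second equality is the Kantorovich-Rubinstein dual form of the 1-Wasserstein distance. The measure is zero exactly when the joint distribution equals the product of its marginals, that is, when the two representations are independent, which is the property the abstract describes inducing.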

Topics

Artificial Intelligence > Learning Paradigms > Transfer Learning
Machine Learning > Application Areas > Domain Adaptation
Machine Learning > Application Areas > Fairness
Natural Language Processing > Applications > Text Classification
Machine Learning > Learning Types > Transfer Learning
Deep Learning > Learning Types > Adversarial Learning
Machine Learning > Learning Types > Fairness

Keywords

transfer learning, domain adaptation, text classification, adversarial training, group fairness

Related papers

On the Natural Gradient of the Evidence Lower Bound 2025

Four Axiomatic Characterizations of the Integrated Gradients Attribution Method 2025

Extending Temperature Scaling with Homogenizing Maps 2025

Ontolearn---A Framework for Large-scale OWL Class Expression Learning in Python 2025

An Axiomatic Definition of Hierarchical Clustering 2025