GENDER1PERSON: Test Suite for Estimating Gender Bias of First-person Singular Forms

Maja Popović; Ekaterina Lapshinova-Koltunski

EMNLP 2025


Abstract

The gender1person test suite is designed to measure gender bias in translating first-person singular forms from English into two Slavic languages, Russian and Serbian. The test suite consists of 1,000 Amazon product reviews, uniformly distributed over 10 product categories. Bias is measured through a gender score ranging from -100 (all reviews are translated as feminine) to 100 (all reviews are translated as masculine). The test suite shows that the majority of the systems participating in the WMT-2025 task for these two target languages prefer the masculine writer's gender. No system is biased towards the feminine variant. Furthermore, for each language pair, seven systems are considered balanced, with gender scores between -10 and 10. Finally, the analysis by product category shows that the choice of the writer's gender depends to a large extent on the product. Moreover, even the systems with balanced overall scores are in fact biased, but in different directions for different product categories.
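The abstract gives only the range of the gender score and the balanced band, not the exact formula. The sketch below is therefore an assumption, not the authors' definition. It computes the score as 100 × (masculine − feminine) / total reviews, which produces the stated extremes of -100 and 100. The label values `"m"` and `"f"`, the function names, and the inclusive handling of the ±10 boundary are all hypothetical choices made for illustration.

```python
from collections import Counter, defaultdict


def gender_score(labels):
    """Assumed score: 100 * (masculine - feminine) / total.

    `labels` holds one label per translated review: "m" (masculine),
    "f" (feminine), or anything else (counted only in the total).
    The formula is an assumption; the abstract states only the range.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no labels given")
    return 100.0 * (counts["m"] - counts["f"]) / total


def is_balanced(score, threshold=10.0):
    # Assumption: the band "between -10 and 10" includes its endpoints.
    return -threshold <= score <= threshold


def scores_by_category(pairs):
    """Compute the assumed score separately for each product category.

    `pairs` is an iterable of (category, label) tuples.
    """
    grouped = defaultdict(list)
    for category, label in pairs:
        grouped[category].append(label)
    return {cat: gender_score(labs) for cat, labs in grouped.items()}


if __name__ == "__main__":
    # Toy data (not from the paper): two categories with opposite bias.
    toy = [("tools", "m")] * 50 + [("beauty", "f")] * 50
    overall = gender_score(label for _, label in toy)
    print(overall, is_balanced(overall))
    print(scores_by_category(toy))
```

In the toy data, the overall score looks balanced while each category is fully skewed in the opposite direction. This is the pattern the abstract reports for systems with balanced overall scores.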



Topics

Machine Learning > Application Areas > Fairness
Natural Language Processing > Applications > Machine Translation
Interdisciplinary > Linguistics > Computational Linguistics
Artificial Intelligence > Core AI > Fairness
Machine Learning > Learning Types > Fairness

Keywords

natural language processing, text classification, machine translation, gender bias, Slavic languages, test suite, bias measurement, first-person pronoun, gender score
