An Empirical Study on the Characteristics of Bias upon Context Length Variation for Bangla

Jayanta Sadhu; Ayan Khan; Abhik Bhattacharjee; Rifat Shahriyar

ACL 2024

Abstract

Pretrained language models inherently exhibit various social biases, and their widespread use makes it crucial to examine their social impact across linguistic contexts. Previous studies have proposed numerous methods for intrinsic bias measurement, predominantly focused on high-resource languages. In this work, we extend these investigations to Bangla, a low-resource language. Specifically, we (1) create a dataset for intrinsic gender bias measurement in Bangla, (2) discuss the adaptations necessary to apply existing bias measurement methods to Bangla, and (3) examine the impact of context length variation on bias measurement, a factor that has been overlooked in previous studies. Through our experiments, we demonstrate a clear dependency of bias metrics on context length, highlighting the need for nuanced considerations in Bangla bias analysis. We consider our work a stepping stone for bias measurement in the Bangla language and make all of our resources publicly available to support future research.

Topics

Machine Learning > Application Areas > Fairness; Natural Language Processing > Resources & Methods > Multilingual NLP; Interdisciplinary > Linguistics; Artificial Intelligence > Core AI > Fairness

Keywords

low-resource language, language model, context length, pretrained language model, gender bias, bias evaluation, bias measurement, Bangla language, gender bias measurement
