Evaluating the Effect of Retrieval Augmentation on Social Biases
Abstract
Retrieval Augmented Generation (RAG) is a popular method for injecting up-to-date information into Large Language Model (LLM)-based Natural Language Generation (NLG) systems. While RAG can enhance factual accuracy, its effect on the social biases inherent in LLMs is not well understood. This paper systematically investigates how RAG modulates social biases across three languages (English, Japanese, and Chinese) and four categories (gender, race, age, and religion). By evaluating various generator LLMs on the BBQ benchmark, we analyse how document collections with controlled stereotypical content affect RAG outputs. We find that biases present in the retrieved documents are often significantly amplified in the generated texts, even when the base LLM itself has a low level of intrinsic bias. These findings raise concerns about the social fairness of RAG systems, underscoring the urgent need for careful bias evaluation before real-world deployment.