Uplift-RAG: Uplift-Driven Knowledge Preference Alignment for Retrieval-Augmented Generation

Changle Qu; Sunhao Dai; Hengyi Cai; Yiyang Cheng; Jun Xu; Shuaiqiang Wang; Dawei Yin

EMNLP 2025

Abstract

Retrieval-augmented generation (RAG) has proven effective in enhancing the knowledge coverage of large language models (LLMs) and mitigating hallucinations by incorporating external retrieved documents. However, documents deemed relevant by the retriever are not necessarily helpful for answer generation, and including misleading information can even degrade performance. Existing efforts to estimate document utility often rely on downstream generation performance, which conflates the influence of external documents with the intrinsic knowledge of the LLM, thereby obscuring the actual contribution of the retrieved content. To address this, this paper proposes Uplift-RAG, an uplift-driven knowledge preference alignment framework for RAG. Specifically, we first propose an uplift-based definition of document utility that quantifies each document’s marginal benefit over the LLM’s internal knowledge. We then optimize the reranker with three alignment objectives to identify and prioritize documents based on their uplift. This enables dynamic selection of documents that address the LLM’s knowledge gaps, going beyond fixed top-k selection, while reducing reference redundancy and the computational overhead of the LLM’s input. Extensive experiments demonstrate the effectiveness of Uplift-RAG.
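The abstract does not give the exact formula, so the following is only a minimal sketch of the general idea of uplift: score a document by how much it improves the answer relative to the LLM answering with no document at all, then keep only documents with positive uplift instead of a fixed top-k. The names `answer_quality`, `uplift_scores`, `select_by_uplift`, and the `threshold` parameter are hypothetical and are not taken from the paper; the paper's actual utility definition and its three alignment objectives may differ.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical scorer: returns a quality score for the LLM's answer to
# `question`, given one document (or None for no retrieval at all).
QualityFn = Callable[[str, Optional[str]], float]


def uplift_scores(
    question: str,
    documents: List[str],
    answer_quality: QualityFn,
) -> List[Tuple[str, float]]:
    """Assumed uplift: quality with the document minus quality without it."""
    baseline = answer_quality(question, None)  # LLM's internal knowledge only
    return [(doc, answer_quality(question, doc) - baseline) for doc in documents]


def select_by_uplift(
    scored: List[Tuple[str, float]],
    threshold: float = 0.0,
) -> List[str]:
    """Keep documents whose uplift exceeds the threshold, best first.

    The number of kept documents varies per question, unlike fixed top-k.
    """
    kept = [(doc, u) for doc, u in scored if u > threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return [doc for doc, _ in kept]
```

Under this sketch, a document that merely repeats what the LLM already knows gets an uplift near zero and is dropped, while a misleading document gets a negative uplift. In the paper's setting a reranker is trained to predict such preferences, so this direct computation would at most serve as a way to think about the training signal.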

Topics

Machine Learning > Core Methods > Representation Learning; Natural Language Processing > Applications > Information Retrieval; Natural Language Processing > Resources & Methods > Large Language Models

Keywords

knowledge distillation; document retrieval; retrieval-augmented generation; knowledge alignment; uplift modeling
