ReflectiveRAG: Rethinking Adaptivity in Retrieval-Augmented Generation
Abstract
Retrieval-Augmented Generation (RAG) systems degrade sharply under extreme noise, where irrelevant or redundant passages dominate. Current methods (fixed top-k retrieval, cross-encoder reranking, or policy-based iteration) depend on static heuristics or costly reinforcement learning. They fail to assess evidence sufficiency, detect subtle mismatches, or reduce redundancy, which leads to hallucinations and poor grounding. We introduce ReflectiveRAG, a lightweight yet reasoning-driven architecture that enhances factual grounding through two complementary mechanisms: Self-Reflective Retrieval (SRR) and Contrastive Noise Removal (NR). SRR employs a small language model as a decision controller that iteratively evaluates evidence sufficiency, enabling adaptive query reformulation without fixed schedules or policy training. NR further refines retrieved content via embedding-based contrastive filtering, enforcing semantic sparsity and removing redundant or tangential passages. Evaluated on WebQuestions, HotpotQA (distractor setting), and InternalQA with 50M Common Crawl distractors, ReflectiveRAG achieves substantial gains over strong baselines, including DeepRAG, improving EM by +2.7 pp and F1 by +2.5 pp while reducing evidence redundancy by 30.88% with only 18 ms of additional latency. Ablation studies confirm that SRR and NR jointly drive both factual accuracy and efficiency, validating our central claim that retrieval reasoning and contrastive filtering can outperform large-scale policy optimization in RAG.
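To make the two mechanisms concrete, the following is a minimal illustrative sketch, not the implementation evaluated in this work. It assumes a greedy cosine-similarity filter as a stand-in for NR and a simple bounded loop for SRR. All names (`filter_passages`, `reflective_retrieve`, `retrieve`, `embed`, `judge`) and both thresholds (`min_relevance`, `max_overlap`) are hypothetical choices made for illustration only.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
Passage = Tuple[str, Vector]  # (text, embedding)


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def filter_passages(
    query_vec: Vector,
    passages: List[Passage],
    min_relevance: float = 0.3,  # assumed threshold, not from the paper
    max_overlap: float = 0.9,    # assumed threshold, not from the paper
) -> List[Passage]:
    """NR stand-in: drop tangential passages and near-duplicates."""
    ranked = sorted(passages, key=lambda p: cosine(query_vec, p[1]), reverse=True)
    kept: List[Passage] = []
    for text, vec in ranked:
        if cosine(query_vec, vec) < min_relevance:
            continue  # tangential: too far from the query
        if any(cosine(vec, kept_vec) > max_overlap for _, kept_vec in kept):
            continue  # redundant: too close to a passage already kept
        kept.append((text, vec))
    return kept


def reflective_retrieve(
    query: str,
    retrieve: Callable[[str], List[str]],   # hypothetical retriever
    embed: Callable[[str], Vector],         # hypothetical embedding function
    judge: Callable[[str, List[str]], Tuple[bool, str]],  # hypothetical controller
    max_rounds: int = 3,
) -> List[str]:
    """SRR stand-in: retrieve, filter, then ask the controller whether to stop."""
    original_vec = embed(query)  # relevance is always scored against the original question
    evidence: List[Passage] = []
    for _ in range(max_rounds):
        candidates = [(text, embed(text)) for text in retrieve(query)]
        evidence = filter_passages(original_vec, evidence + candidates)
        sufficient, next_query = judge(query, [text for text, _ in evidence])
        if sufficient:
            break
        query = next_query  # controller-proposed reformulation
    return [text for text, _ in evidence]
```

In this sketch the `judge` callable plays the role of the small language model controller: it returns whether the evidence is sufficient and, if not, a reformulated query. The loop therefore stops as soon as the controller is satisfied rather than after a fixed number of retrieval steps, with `max_rounds` acting only as a safety bound.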