Gradient-guided Attention Map Editing: Towards Efficient Contextual Hallucination Mitigation

Yu Wang; Jiaxin Zhang; Xiang Gao; Wendi Cui; Peng Li; Kamalika Das

2025 NAACL NAACL 2025

Gradient-guided Attention Map Editing: Towards Efficient Contextual Hallucination Mitigation

Abstract

AbstractIn tasks such as summarization and open-book question answering (QA), Large Language Models (LLMs) frequently experience “contextual hallucination”, where they generate irrelevant or incorrect responses despite having access to accurate information in the input. This issue often stems from the models’ propensity to prioritize self-generated content over input context, leading to a disregard for pertinent details. To address this challenge, we introduce, Guided Attention Map Editing (GAME), an innovative approach that dynamically adjusts attention maps to enhance contextual relevance. During inference, GAME employs a trained classifier to identify attention maps likely to induce hallucinations and implements targeted interventions. These interventions, guided by gradient-informed “edit directions”, strategically redistribute attention weights across various heads to efficiently mitigate hallucination. Extensive evaluations on challenging summarization and open-book QA tasks demonstrate that GAME consistently and significantly reduces hallucinations across diverse open-source models, thereby improving the reliability and applicability of LLMs.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Machine Learning

🧭 Keyword Pioneer — gradient-guided editing

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio