Knowledge-Centric Hallucination Detection

Xiangkun Hu; Dongyu Ru; Lin Qiu; Qipeng Guo; Tianhang Zhang; Yang Xu; Yun Luo; Pengfei Liu; Yue Zhang; Zheng Zhang

EMNLP 2024

Abstract

Large Language Models (LLMs) have shown impressive capabilities but also a concerning tendency to hallucinate. This paper presents RefChecker, a framework that introduces claim-triplets to represent claims in LLM responses, aiming to detect fine-grained hallucinations. In RefChecker, an extractor generates claim-triplets from a response, and a checker then evaluates them against a reference. We delineate three task settings, Zero, Noisy and Accurate Context, to reflect various real-world use cases. We curated a benchmark spanning various NLP tasks and annotated 11k claim-triplets from 2.1k responses by seven LLMs. RefChecker supports both proprietary and open-source models as the extractor and checker. Experiments demonstrate that claim-triplets enable superior hallucination detection compared to other granularities such as response, sentence and sub-sentence level claims. RefChecker outperforms prior methods by 18.2 to 27.2 points on our benchmark, and its checking results are strongly aligned with human judgments.
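The extractor-then-checker flow described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not RefChecker's actual API: the `ClaimTriplet` fields (subject, predicate, object), the canned extractor, the substring-based `toy_checker`, and the labels "supported" and "unverified". In the framework itself, both the extractor and the checker are models; the stand-ins below only show how per-triplet verdicts give a finer-grained result than a single verdict for the whole response.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class ClaimTriplet:
    """Hypothetical container for one claim: (subject, predicate, object)."""
    subject: str
    predicate: str
    obj: str


# Hypothetical signatures: an extractor maps a response to triplets,
# a checker maps (triplet, reference) to a label string.
Extractor = Callable[[str], List[ClaimTriplet]]
Checker = Callable[[ClaimTriplet, str], str]


def make_canned_extractor(table: Dict[str, List[ClaimTriplet]]) -> Extractor:
    """Stand-in for a model-based extractor: looks up pre-written triplets."""
    def extract(response: str) -> List[ClaimTriplet]:
        return table.get(response, [])
    return extract


def toy_checker(triplet: ClaimTriplet, reference: str) -> str:
    """Placeholder checker (an assumption, not the paper's method):
    'supported' if every part of the triplet occurs in the reference
    as a case-insensitive substring, otherwise 'unverified'."""
    ref = reference.lower()
    parts = (triplet.subject, triplet.predicate, triplet.obj)
    return "supported" if all(p.lower() in ref for p in parts) else "unverified"


def check_response(
    response: str,
    reference: str,
    extractor: Extractor,
    checker: Checker,
) -> List[Tuple[ClaimTriplet, str]]:
    """Extract triplets from the response, then label each against the reference."""
    return [(t, checker(t, reference)) for t in extractor(response)]


# Usage with made-up example text.
response = "Marie Curie won the Nobel Prize. Marie Curie was born in Paris."
reference = "Marie Curie was born in Warsaw and won the Nobel Prize in Physics."
extractor = make_canned_extractor({
    response: [
        ClaimTriplet("Marie Curie", "won", "the Nobel Prize"),
        ClaimTriplet("Marie Curie", "was born in", "Paris"),
    ]
})
results = check_response(response, reference, extractor, toy_checker)
for triplet, label in results:
    print(triplet, label)
```

In this sketch the first triplet is labeled "supported" and the second "unverified", so the unsupported claim is isolated even though the response as a whole is partly correct.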

Topics

Natural Language Processing > Applications > Fact-Checking
Natural Language Processing > Resources & Methods > Large Language Models
Artificial Intelligence > Core AI > Large Language Models

Keywords

hallucination detection, fine-grained evaluation, text quality assessment, reference verification, reference checking
