DelucionQA: Detecting Hallucinations in Domain-specific Question Answering

Mobashir Sadat; Zhengyu Zhou; Lukas Lange; Jun Araki; Arsalan Gundroo; Bingqing Wang; Rakesh Menon; Md Parvez; Zhe Feng

EMNLP 2023

Abstract

Hallucination is a well-known phenomenon in text generated by large language models (LLMs). Hallucinatory responses occur in almost all application scenarios, e.g., summarization and question answering (QA). For applications requiring high reliability (e.g., customer-facing assistants), the potential presence of hallucination in LLM-generated text is a critical problem. The amount of hallucination can be reduced by using information retrieval to provide relevant background information to the LLM. However, LLMs can still generate hallucinatory content for various reasons (e.g., prioritizing their parametric knowledge over the context, or failing to capture the relevant information from the context). Detecting hallucinations through automated methods is thus paramount. To facilitate research in this direction, we introduce a sophisticated dataset, DelucionQA, that captures hallucinations made by retrieval-augmented LLMs for a domain-specific QA task. Furthermore, we propose a set of hallucination detection methods to serve as baselines for future work by the research community. We also provide analysis and a case study to share insights on hallucination phenomena in the target scenario.

Topics

Artificial Intelligence > Core AI > Interpretability
Natural Language Processing > Applications > Question Answering

Keywords

natural language generation; retrieval-augmented generation; hallucination detection; large language model; domain-specific question answering
