ANAH-v2: Scaling Analytical Hallucination Annotation of Large Language Models

Yuzhe Gu; Ziwei Ji; Wenwei Zhang; Chengqi Lyu; Dahua Lin; Kai Chen

NeurIPS 2024

Abstract

Large language models (LLMs) exhibit hallucinations in long-form question-answering tasks across a wide range of domains and applications. Current hallucination detection and mitigation datasets are limited in domain and size, and they struggle to scale because of prohibitive labor costs and the insufficient reliability of existing hallucination annotators. To facilitate the scalable oversight of LLM hallucinations, this paper introduces an iterative self-training framework that simultaneously and progressively scales up the annotation dataset and improves the accuracy of the annotator. Based on the Expectation Maximization algorithm, in each iteration the framework first applies an automatic hallucination annotation pipeline to a scaled-up dataset and then trains a more accurate annotator on that dataset. This new annotator is adopted in the annotation pipeline for the next iteration. Extensive experimental results demonstrate that the final hallucination annotator, with only 7B parameters, surpasses GPT-4 and obtains new state-of-the-art hallucination detection results on HaluEval and HalluQA by zero-shot inference. Such an annotator can not only evaluate the hallucination levels of various LLMs on the large-scale dataset but also help to mitigate hallucination in LLM generations, with the Natural Language Inference metric increasing from 25% to 37% on HaluEval.
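
The alternation described in the abstract (annotate a larger dataset with the current annotator, then train a new annotator on the result) can be illustrated with a toy loop. This is only a minimal sketch under stated assumptions, not the paper's pipeline: each answer is reduced to a hypothetical scalar "support score", the annotator is a simple threshold rule, and all function names are invented for illustration.

```python
from statistics import mean
from typing import Callable, List, Sequence, Tuple

# Hypothetical toy setting: each item is a scalar "support score" in [0, 1];
# label 1 means "hallucinated", label 0 means "supported".
Annotator = Callable[[float], int]
Labeled = List[Tuple[float, int]]


def make_threshold_annotator(threshold: float) -> Annotator:
    """Stand-in for an annotator model: flags low-support items."""
    return lambda score: 1 if score < threshold else 0


def annotate(annotator: Annotator, items: Sequence[float]) -> Labeled:
    """Annotation step: label every item with the current annotator."""
    return [(x, annotator(x)) for x in items]


def train_annotator(labeled: Labeled) -> Annotator:
    """Training step: refit the threshold midway between the class means."""
    flagged = [x for x, y in labeled if y == 1]
    supported = [x for x, y in labeled if y == 0]
    if not flagged or not supported:
        raise ValueError("need at least one example of each label")
    return make_threshold_annotator((mean(flagged) + mean(supported)) / 2)


def iterative_self_training(
    seed: Annotator, pools: Sequence[Sequence[float]]
) -> Tuple[Annotator, Labeled]:
    """Alternate annotation and training over progressively larger pools."""
    annotator, labeled = seed, []
    for pool in pools:
        labeled = annotate(annotator, pool)    # annotate the scaled dataset
        annotator = train_annotator(labeled)   # train the next annotator
    return annotator, labeled


# Usage: a deliberately poor seed annotator and two pools of growing size.
seed = make_threshold_annotator(0.85)
pools = [[0.1, 0.2, 0.8, 0.9], [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]]
final_annotator, final_labels = iterative_self_training(seed, pools)
```

In this toy run, the seed threshold mislabels the item at 0.8 in the first pool, and refitting on each newly annotated pool moves the threshold toward the gap between the two groups. Whether earlier annotations are kept or regenerated in each round is a design choice the abstract does not specify; the sketch simply re-annotates each pool from scratch.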

Topics

Artificial Intelligence > Core AI > AI Safety
Artificial Intelligence > Core AI > Large Language Models
Machine Learning > Learning Types > Self-Supervised Learning
Machine Learning > Optimization & Theory > Evaluation
Natural Language Processing > Applications > Question Answering
Natural Language Processing > Applications > Natural Language Inference
Deep Learning > Models > Large Language Models
Deep Learning > Learning Types > Self-Supervised Learning

Keywords

natural language inference; expectation maximization; hallucination detection; self-training framework; iterative annotation; long-form question-answering; zero-shot inference; large language model
