Enhancing Hallucination Detection through Perturbation-Based Synthetic Data Generation in System Responses

Dongxu Zhang; Varun Gangal; Barrett Lattimer; Yi Yang

2024 ACL ACL 2024

Enhancing Hallucination Detection through Perturbation-Based Synthetic Data Generation in System Responses

Abstract

AbstractDetecting hallucinations in large language model (LLM) outputs is pivotal, yet traditional fine-tuning for this classification task is impeded by the expensive and quickly outdated annotation process, especially across numerous vertical domains and in the face of rapid LLM advancements. In this study, we introduce an approach that automatically generates both faithful and hallucinated outputs by rewriting system responses. Experimental findings demonstrate that a T5-base model, fine-tuned on our generated dataset, surpasses state-of-the-art zero-shot detectors and existing synthetic generation methods in both accuracy and latency, indicating efficacy of our approach.

🌉 Interdisciplinary Bridge — Deep Learning and Machine Learning

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Dongxu Zhang , Varun Gangal , Barrett Lattimer , Yi Yang

Topics

Machine Learning > Core Methods > Classification Machine Learning > Learning Types > Zero-Shot Learning Deep Learning > Architectures > Transformers

Keywords

zero-shot learning text classification synthetic data generation hallucination detection large language model

Download PDF

Related papers

Reinforcement Learning-Driven LLM Agent for Automated Attacks on LLMs 2024

EtymoLink: A Structured English Etymology Dataset 2024

Turkish Delights: A Dataset on Turkish Euphemisms 2024

Subjectivity Detection in English News using Large Language Models 2024

Does DetectGPT Fully Utilize Perturbation? Bridging Selective Perturbation to Fine-tuned Contrastive Learning Detector would be Better 2024