Better Late Than Never: Model-Agnostic Hallucination Post-Processing Framework Towards Clinical Text Summarization

Songda Li; Yunqi Zhang; Chunyuan Deng; Yake Niu; Hui Zhao

ACL 2024

Abstract

Clinical text summarization has proven successful in generating concise and coherent summaries. However, these summaries may contain hallucinations, which can mislead clinicians and patients. Existing methods for mitigating hallucinations can be categorized into task-specific and task-agnostic approaches. Task-specific methods lack the versatility needed for real-world use. Task-agnostic methods, meanwhile, are not model-agnostic: they require retraining for each model, which incurs considerable computational cost. To address these challenges, we propose MEDAL, a model-agnostic framework designed to post-process medical hallucinations. MEDAL can seamlessly integrate with any medical summarization model, requiring no additional computational overhead. MEDAL comprises a medical infilling model and a hallucination correction model. The infilling model generates non-factual summaries containing common errors, which are used to train the correction model. The correction model incorporates a self-examination mechanism to activate its cognitive capability. We conduct comprehensive experiments using 11 widely accepted metrics on 7 baseline models across 3 medical text summarization tasks. MEDAL demonstrates superior performance in correcting hallucinations when applied to summaries generated by pre-trained language models and large language models.
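The two-stage design described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`make_training_pairs`, `with_post_processing`), the toy stand-in models, and the choice to pass the source document to the corrector are all assumptions made for illustration. The sketch shows only the data flow: corrupted summaries are paired with their originals as training data, and a corrector is wrapped around an arbitrary summarizer without touching the summarizer itself.

```python
from typing import Callable, List, Tuple

# Type aliases (assumed for this sketch, not taken from the paper).
Summarizer = Callable[[str], str]        # source document -> draft summary
Corrector = Callable[[str, str], str]    # (source, draft) -> corrected summary


def make_training_pairs(
    summaries: List[str],
    corrupt: Callable[[str], str],
) -> List[Tuple[str, str]]:
    """Pair each corrupted (non-factual) summary with its original as the target."""
    return [(corrupt(s), s) for s in summaries]


def with_post_processing(summarize: Summarizer, correct: Corrector) -> Summarizer:
    """Wrap any summarizer with a correction step; the summarizer is not retrained."""
    def pipeline(source: str) -> str:
        draft = summarize(source)
        return correct(source, draft)
    return pipeline


# --- Hypothetical toy stand-ins, for demonstration only ---

def toy_corrupt(summary: str) -> str:
    # Stands in for the infilling model: swaps one term for a wrong one.
    return summary.replace("aspirin", "warfarin")


def toy_summarizer(source: str) -> str:
    # Stands in for any summarization model; deliberately emits a wrong drug name.
    return "Patient started on warfarin."


def toy_corrector(source: str, draft: str) -> str:
    # Stands in for a trained correction model: restores the drug named in the source.
    if "aspirin" in source and "warfarin" not in source:
        return draft.replace("warfarin", "aspirin")
    return draft


pairs = make_training_pairs(["Patient started on aspirin."], toy_corrupt)
pipeline = with_post_processing(toy_summarizer, toy_corrector)
corrected = pipeline("Note: aspirin 81 mg daily was started.")
```

Because `with_post_processing` accepts any callable that maps a source to a summary, swapping in a different summarizer requires no change to the corrector, which is the sense of "model-agnostic" this sketch is meant to convey.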

Topics

Natural Language Processing > Generation > Summarization
Healthcare & Medicine > Clinical > Clinical NLP
Natural Language Processing > Applications > Summarization
Deep Learning > Learning Types > Generative Models

Keywords

natural language generation, text generation, language model, hallucination detection, hallucination correction, clinical text summarization, medical hallucination
