Hallucination Detection for Grounded Instruction Generation

Lingjun Zhao; Khanh Nguyen; Hal Daumé III

EMNLP 2023

Abstract

We investigate the problem of generating instructions to guide humans to navigate in simulated residential environments. A major issue with current models is hallucination: they generate references to actions or objects that are inconsistent with what a human follower would perform or encounter along the described path. We develop a model that detects these hallucinated references by adopting a model pre-trained on a large corpus of image-text pairs, and fine-tuning it with a contrastive loss that separates correct instructions from instructions containing synthesized hallucinations. Our final model outperforms several baselines, including using word probability estimated by the instruction-generation model, and supervised models based on LSTM and Transformer.
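
The contrastive fine-tuning idea can be illustrated with a minimal sketch. It assumes a softmax-style contrastive objective over alignment scores, which is one common choice; the abstract does not state the exact loss the authors use. The function names, the toy vocabulary, and the plain-float scores are all hypothetical. In a real system the scores would come from the fine-tuned image-text model comparing an instruction with the path it describes.

```python
import math
import random


def synthesize_hallucination(tokens, index, vocabulary, rng):
    """Return a copy of `tokens` with the word at `index` replaced by a
    different word from `vocabulary` (a synthetic negative example).

    Hypothetical helper: word substitution is only one way to corrupt
    an instruction, assumed here for illustration.
    """
    candidates = [word for word in vocabulary if word != tokens[index]]
    corrupted = list(tokens)
    corrupted[index] = rng.choice(candidates)
    return corrupted


def contrastive_loss(correct_score, hallucinated_scores):
    """Negative log-probability of the correct instruction under a softmax
    over the correct instruction and its hallucinated variants.

    The loss shrinks as the correct instruction's score rises above the
    scores of the hallucinated ones.
    """
    scores = [correct_score] + list(hallucinated_scores)
    peak = max(scores)  # subtract the maximum for numerical stability
    log_normalizer = peak + math.log(sum(math.exp(s - peak) for s in scores))
    return log_normalizer - correct_score


# Build one synthetic negative by swapping the object word at position 3.
instruction = "walk past the sofa and stop at the door".split()
negative = synthesize_hallucination(
    instruction, 3, ["sofa", "table", "bed"], random.Random(0)
)

# Assumed scores: a well-separated pair versus a poorly separated one.
well_separated = contrastive_loss(4.0, [0.5, -1.0])
poorly_separated = contrastive_loss(0.5, [4.0, -1.0])
```

Under this sketch, `well_separated` is smaller than `poorly_separated`, so minimizing the loss pushes the score of the correct instruction above the scores of its corrupted variants. At test time, a low score for a candidate reference would then be treated as evidence of hallucination.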

Topics

Artificial Intelligence > Core AI > Multimodal Learning
Machine Learning > Learning Types > Contrastive Learning
Computer Vision > Analysis > Scene Understanding
Artificial Intelligence > Core AI > Reasoning
Computer Vision > Core AI > Multimodal Learning
Machine Learning > Learning Types > Multi-Modal Learning
Artificial Intelligence > Core AI > Computer Vision
Deep Learning > Learning Types > Contrastive Learning

Keywords

contrastive learning, multimodal learning, vision-language model, hallucination detection, image-text alignment, instruction generation, navigation instruction, grounded instruction generation
