2024 NAACL NAACL 2024

NootNoot At SemEval-2024 Task 6: Hallucinations and Related Observable Overgeneration Mistakes Detection

Abstract

AbstractSemantic hallucinations in neural language gen-eration systems pose a significant challenge tothe reliability and accuracy of natural languageprocessing applications. Current neural mod-els often produce fluent but incorrect outputs,undermining the usefulness of generated text.In this study, we address the task of detectingsemantic hallucinations through the SHROOM(Semantic Hallucinations Real Or Mistakes)dataset, encompassing data from diverse NLGtasks such as definition modeling, machinetranslation, and paraphrase generation. We in-vestigate three methodologies: fine-tuning onlabelled training data, fine-tuning on labelledvalidation data, and a zero-shot approach usingthe Mixtral 8x7b instruct model. Our resultsdemonstrate the effectiveness of these method-ologies in identifying semantic hallucinations,with the zero-shot approach showing compet-itive performance without additional training.Our findings highlight the importance of robustdetection mechanisms for ensuring the accu-racy and reliability of neural language genera-tion systems.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Machine Learning and Natural Language Processing
🧭 Keyword Pioneer — semantic hallucination
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio