UMUTeam at SemEval-2024 Task 6: Leveraging Zero-Shot Learning for Detecting Hallucinations and Related Observable Overgeneration Mistakes

Ronghao Pan; José Antonio García-Díaz; Tomás Bernal-beltrán; Rafael Valencia-García

2024 NAACL NAACL 2024

UMUTeam at SemEval-2024 Task 6: Leveraging Zero-Shot Learning for Detecting Hallucinations and Related Observable Overgeneration Mistakes

Abstract

AbstractIn these working notes we describe the UMUTeam’s participation in SemEval-2024 shared task 6, which aims at detecting grammatically correct output of Natural Language Generation with incorrect semantic information in two different setups: model-aware and model-agnostic tracks. The task is consists of three subtasks with different model setups. Our approach is based on exploiting the zero-shot classification capability of the Large Language Models LLaMa-2, Tulu and Mistral, through prompt engineering. Our system ranked eighteenth in the model-aware setup with an accuracy of 78.4% and 29th in the model-agnostic setup with an accuracy of 76.9333%.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Machine Learning and Natural Language Processing

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Ronghao Pan , José Antonio García-Díaz , Tomás Bernal-beltrán , Rafael Valencia-García

Topics

Artificial Intelligence > Core AI > Interpretability Artificial Intelligence > Core AI > Responsible AI Machine Learning > Learning Types > Zero-Shot Learning Natural Language Processing > Generation > Text Generation Natural Language Processing > Applications > Fact-Checking Natural Language Processing > Applications > Text Classification Natural Language Processing > Resources & Methods > Large Language Models Artificial Intelligence > Core AI > Large Language Models

Keywords

zero-shot learning text classification natural language generation prompt engineering text generation hallucination detection large language model semantic verification

Download PDF

Related papers

Working Alliance Transformer for Psychotherapy Dialogue Classification 2024

Named Entity Recognition Under Domain Shift via Metric Learning for Life Sciences 2024

Assessing Logical Puzzle Solving in Large Language Models: Insights from a Minesweeper Case Study 2024

TelME: Teacher-leading Multimodal Fusion Network for Emotion Recognition in Conversation 2024

Extractive Summarization with Text Generator 2024