SHROOM-INDElab at SemEval-2024 Task 6: Zero- and Few-Shot LLM-Based Classification for Hallucination Detection

Bradley Allen; Fina Polat; Paul Groth

2024 NAACL NAACL 2024

SHROOM-INDElab at SemEval-2024 Task 6: Zero- and Few-Shot LLM-Based Classification for Hallucination Detection

Abstract

AbstractWe describe the University of Amsterdam Intelligent Data Engineering Lab team’s entry for the SemEval-2024 Task 6 competition. The SHROOM-INDElab system builds on previous work on using prompt programming and in-context learning with large language models (LLMs) to build classifiers for hallucination detection, and extends that work through the incorporation of context-specific definition of task, role, and target concept, and automated generation of examples for use in a few-shot prompting approach. The resulting system achieved fourth-best and sixth-best performance in the model-agnostic track and model-aware tracks for Task 6, respectively, and evaluation using the validation sets showed that the system’s classification decisions were consistent with those of the crowdsourced human labelers. We further found that a zero-shot approach provided better accuracy than a few-shot approach using automatically generated examples. Code for the system described in this paper is available on Github.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Natural Language Processing

🧭 Keyword Pioneer — prompt programming

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Bradley Allen , Fina Polat , Paul Groth

Topics

Artificial Intelligence > Learning Paradigms > Few-Shot Learning Natural Language Processing > Resources & Methods > Large Language Models Artificial Intelligence > Learning Paradigms > Zero-Shot Learning

Keywords

zero-shot learning few-shot learning in-context learning hallucination detection large language model prompt programming

Download PDF

Related papers

Working Alliance Transformer for Psychotherapy Dialogue Classification 2024

Named Entity Recognition Under Domain Shift via Metric Learning for Life Sciences 2024

Assessing Logical Puzzle Solving in Large Language Models: Insights from a Minesweeper Case Study 2024

TelME: Teacher-leading Multimodal Fusion Network for Emotion Recognition in Conversation 2024

Extractive Summarization with Text Generator 2024