LLMCrit: Teaching Large Language Models to Use Criteria

Weizhe Yuan; Pengfei Liu; Matthias Gallé

ACL 2024

Abstract

Humans follow criteria when they execute tasks, and these criteria are directly used to assess the quality of task completion. Therefore, having models learn to use criteria to provide feedback can help humans or models to perform tasks better. However, current research in this area tends to consider only a limited number of criteria, or only a limited number of quality assessment aspects. To fill this gap, we propose a general framework that enables large language models (LLMs) to use comprehensive criteria for a task in delivering natural language feedback on task execution. In particular, we present a model-in-the-loop framework that semi-automatically derives criteria from collected guidelines for different writing tasks and constructs in-context demonstrations for each criterion. We choose three tasks from real-world scenarios to operationalize this idea: paper introduction writing, Python code writing, and Reddit post writing, and evaluate our feedback generation framework using different LLMs. The results reveal the fine-grained effects of adding criteria and demonstrations and provide valuable guidance on how to teach LLMs to use criteria more effectively.
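As a rough illustration of the idea of pairing each criterion with in-context demonstrations, the following Python sketch assembles a feedback prompt for a single criterion. It is not the authors' implementation: the `Criterion` and `Demonstration` classes, the `build_feedback_prompt` function, the prompt layout, and the example criterion text are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Demonstration:
    """One in-context example: a text excerpt and feedback on it (hypothetical structure)."""
    text: str
    feedback: str


@dataclass
class Criterion:
    """A single criterion derived from task guidelines (hypothetical structure)."""
    name: str
    description: str
    demonstrations: List[Demonstration] = field(default_factory=list)


def build_feedback_prompt(task: str, criterion: Criterion, submission: str) -> str:
    """Assemble a prompt asking a model for feedback on one criterion.

    The layout below is an assumption for illustration, not the paper's format.
    """
    parts = [
        f"Task: {task}",
        f"Criterion: {criterion.name}",
        f"Description: {criterion.description}",
    ]
    # Add each demonstration as a numbered text/feedback pair.
    for i, demo in enumerate(criterion.demonstrations, start=1):
        parts.append(f"Example {i} text:\n{demo.text}")
        parts.append(f"Example {i} feedback:\n{demo.feedback}")
    # The submission to review comes last, followed by the cue for the model.
    parts.append(f"Text to review:\n{submission}")
    parts.append("Feedback:")
    return "\n\n".join(parts)


if __name__ == "__main__":
    # Made-up criterion and demonstration, for illustration only.
    criterion = Criterion(
        name="States the research gap",
        description="The introduction should say what prior work leaves unaddressed.",
        demonstrations=[
            Demonstration(
                text="Many methods exist for this problem.",
                feedback="The gap is not stated; say what existing methods fail to do.",
            )
        ],
    )
    print(build_feedback_prompt("paper introduction writing", criterion, "We propose a new model."))
```

The resulting string could be sent to any LLM; building one prompt per criterion keeps the feedback focused on a single quality aspect at a time, which is one plausible way to scale to a comprehensive criteria set.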

Topics

Natural Language Processing > Generation > Text Generation; Artificial Intelligence > Core AI > Large Language Models; Natural Language Processing > Resources & Methods > Language Modeling; Machine Learning > Learning Types > In-Context Learning; Artificial Intelligence > Core AI > Natural Language Generation

Keywords

in-context learning; feedback generation; instruction tuning; task execution; quality assessment; writing task; in-context demonstration; large language model; natural language feedback; criteria-based evaluation; criterion-based feedback; criteria-based feedback
