REFINER: Reasoning Feedback on Intermediate Representations

Debjit Paul; Mete Ismayilzada; Maxime Peyrard; Beatriz Borges; Antoine Bosselut; Robert West; Boi Faltings

2024 EACL EACL 2024

REFINER: Reasoning Feedback on Intermediate Representations

Abstract

AbstractLanguage models (LMs) have recently shown remarkable performance on reasoning tasks by explicitly generating intermediate inferences,e.g., chain-of-thought prompting. However, these intermediate inference steps may be inappropriate deductions from the initial contextand lead to incorrect final predictions. Here we introduce REFINER, a framework for finetuning LMs to explicitly generate intermediate reasoning steps while interacting with a critic model that provides automated feedback on the reasoning. Specifically, the critic provides structured feedback that the reasoning LM uses to iteratively improve its intermediate arguments. Empirical evaluations of REFINER on three diverse reasoning tasks show significant improvements over baseline LMs of comparable scale. Furthermore, when using GPT-3.5 or ChatGPT as the reasoner, the trained critic significantly improves reasoning without finetuning the reasoner. Finally, our critic model is trained without expensive human-in-the-loop data but can be substituted with humans at inference time.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Natural Language Processing

🧭 Keyword Pioneer — intermediate inference

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Debjit Paul , Mete Ismayilzada , Maxime Peyrard , Beatriz Borges , Antoine Bosselut , Robert West , Boi Faltings

Topics

Artificial Intelligence > Core AI > Foundation Models Artificial Intelligence > Core AI > Interpretability Natural Language Processing > Generation > Language Modeling

Keywords

language model chain-of-thought prompting intermediate inference

Download PDF

Related papers

A Dataset for Metaphor Detection in Early Medieval Hebrew Poetry 2024

PRILoRA: Pruned and Rank-Increasing Low-Rank Adaptation 2024

Overview of the Hate Speech Detection in Turkish and Arabic Tweets (HSD-2Lang) Shared Task at CASE 2024 2024

Evaluating In-Context Learning for Computational Literary Studies: A Case Study Based on the Automatic Recognition of Knowledge Transfer in German Drama 2024

Selam@DravidianLangTech 2024:Identifying Hate Speech and Offensive Language 2024