When Life Gives You Lemons, Make Cherryade: Converting Feedback from Bad Responses into Good Labels

Weiyan Shi; Emily Dinan; Kurt Shuster; Jason Weston; Jing Xu

NAACL 2024

Abstract

Deployed dialogue agents have the potential to integrate human feedback to continuously improve themselves. However, humans may not always provide explicit signals when the chatbot makes mistakes during interactions. In this work, we propose Juicer, a framework to make use of both binary and free-form textual human feedback. It works by: (i) extending sparse binary feedback by training a satisfaction classifier to label the unlabeled data; and (ii) training a reply corrector to map the bad replies to good ones. We find that augmenting training with model-corrected replies improves the final dialogue model, and we can further improve performance by using both positive and negative replies through the recently proposed Director model.
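
The two steps in the abstract can be read as a data-augmentation pipeline. The following is a minimal sketch of that data flow only, not the authors' implementation: the `Turn` record, the `juicer_augment` function, and the toy classifier and corrector are all hypothetical names introduced for illustration, and the real framework would use trained models in place of the keyword and string stand-ins. In particular, having the classifier look at the user's free-form feedback is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Pair = Tuple[str, str]  # (dialogue context, bot reply)


@dataclass
class Turn:
    context: str                      # dialogue history before the bot reply
    reply: str                        # the bot's reply
    satisfied: Optional[bool] = None  # sparse binary feedback; None if absent
    feedback: Optional[str] = None    # optional free-form textual feedback


def juicer_augment(
    turns: List[Turn],
    classify: Callable[[Turn], bool],
    correct: Callable[[Turn], str],
) -> Tuple[List[Pair], List[Pair]]:
    """Return (good, bad) training pairs.

    Step (i): use the human binary label when present, otherwise fall back
    to the satisfaction classifier.
    Step (ii): for every bad reply, add a corrected reply as a good example.
    """
    good: List[Pair] = []
    bad: List[Pair] = []
    for t in turns:
        label = t.satisfied if t.satisfied is not None else classify(t)
        if label:
            good.append((t.context, t.reply))
        else:
            bad.append((t.context, t.reply))
            good.append((t.context, correct(t)))
    return good, bad


# Toy stand-ins (assumptions): real versions would be trained models.
def toy_classifier(turn: Turn) -> bool:
    return "wrong" not in (turn.feedback or "").lower()


def toy_corrector(turn: Turn) -> str:
    return f"[corrected] {turn.reply}"


turns = [
    Turn("What's the capital of France?", "Paris.", satisfied=True),
    Turn("And of Australia?", "Sydney.", feedback="That's wrong, it's Canberra."),
    Turn("Thanks", "You're welcome!"),
]
good, bad = juicer_augment(turns, toy_classifier, toy_corrector)
```

In this sketch, `good` would be used to augment the training data for the final dialogue model, while keeping `bad` alongside it mirrors the abstract's point that both positive and negative replies can be used, for example by a Director-style model.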

Topics

Artificial Intelligence > Core AI > Human-AI Interaction; Machine Learning > Learning Types > Weakly Supervised Learning; Machine Learning > Application Areas > Data Augmentation

Keywords

text classification; weakly supervised learning; human feedback; dialogue system; reply correction
