Predictive Engagement: An Efficient Metric for Automatic Evaluation of Open-Domain Dialogue Systems

Sarik Ghazarian; Ralph Weischedel; Aram Galstyan; Nanyun Peng

AAAI 2020

Abstract

User engagement is a critical metric for evaluating the quality of open-domain dialogue systems. Prior work has focused on conversation-level engagement, using heuristically constructed features such as the number of turns and the total duration of the conversation. In this paper, we investigate the feasibility and efficacy of estimating utterance-level engagement and define a novel metric, predictive engagement, for the automatic evaluation of open-domain dialogue systems. Our experiments demonstrate that (1) human annotators show high agreement when assessing utterance-level engagement scores, and (2) conversation-level engagement scores can be predicted from properly aggregated utterance-level engagement scores. Furthermore, we show that utterance-level engagement scores can be learned from data. These scores can be incorporated into automatic evaluation metrics for open-domain dialogue systems to improve their correlation with human judgements. This suggests that predictive engagement can be used as real-time feedback for training better dialogue models.
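As a rough illustration of the two ideas in the abstract (aggregating utterance-level scores into a conversation-level score, and folding engagement into an existing automatic metric), here is a minimal Python sketch. The abstract does not specify the aggregation or the combination rule, so the mean aggregation, the weighted average, the `weight` parameter, and the function names `conversation_engagement` and `combined_score` are all assumptions made for illustration, not the authors' method.

```python
from statistics import mean


def conversation_engagement(utterance_scores):
    """Aggregate utterance-level engagement scores into one conversation-level score.

    Assumption: a plain mean is used as the aggregation; the abstract only
    says the scores are "properly aggregated".
    """
    if not utterance_scores:
        raise ValueError("need at least one utterance-level score")
    return mean(utterance_scores)


def combined_score(base_metric, engagement, weight=0.5):
    """Blend an existing automatic metric score with an engagement score.

    Assumption: a weighted average with a hypothetical `weight` parameter;
    both inputs are expected to lie in [0, 1].
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0, 1]")
    return (1.0 - weight) * base_metric + weight * engagement


# Hypothetical per-utterance engagement scores for one conversation.
scores = [0.25, 0.75, 0.5]
conv = conversation_engagement(scores)
print("conversation-level engagement:", conv)
print("combined with a base metric of 0.5:", combined_score(0.5, conv))
```

In practice the per-utterance scores would come from a learned engagement predictor rather than being written by hand, and the weight would need to be tuned against human judgements.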

Authors

Sarik Ghazarian, Ralph Weischedel, Aram Galstyan, Nanyun Peng

Topics

Natural Language Processing > Generation > Dialogue Systems
Machine Learning > Learning Types > Evaluation

Keywords

human annotation, dialogue evaluation, dialogue system evaluation, open-domain dialogue, automatic evaluation metric, engagement prediction, predictive engagement, utterance-level engagement

Related papers

Enhancing Pointer Network for Sentence Ordering with Pairwise Ordering Predictions 2020

CopyMTL: Copy Mechanism for Joint Extraction of Entities and Relations with Multi-Task Learning 2020

Neural Simile Recognition with Cyclic Multitask Learning and Local Attention 2020

Being Optimistic to Be Conservative: Quickly Learning a CVaR Policy 2020

Multi-Point Semantic Representation for Intent Classification 2020