Improving Contextual Query Rewrite for Conversational AI Agents through User-preference Feedback Learning

Zhongkai Sun; Yingxue Zhou; Jie Hao; Xing Fan; Yanbin Lu; Chengyuan Ma; Wei Shen; Chenlei Guo

EMNLP 2023

Abstract

Contextual query rewriting (CQR) is a crucial component in conversational AI agents, leveraging contextual information from previous user-agent conversations to improve comprehension of the current user intent. However, traditional CQR methods often rely on supervised fine-tuning alone, neglecting the opportunity to learn from user feedback and align with user preferences. Inspired by recent advances in learning from human feedback (LHF), this paper proposes a novel Preference Aligned Contextual Query Rewriting (PA-CQR) framework to enhance the CQR model’s capability in generating user preference-aligned rewrites. This paper also investigates the efficacy of various state-of-the-art feedback learning algorithms on the CQR task, and proposes a novel Dynamic Direct Preference Optimization (Dynamic DPO) algorithm to better adapt the DPO algorithm to large-scale CQR training. Experiments on a large-scale real-world CQR dataset demonstrate the superiority of the proposed PA-CQR framework and Dynamic DPO.
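
The abstract names DPO without stating its objective. As background only, the sketch below computes the standard DPO loss for a single preference pair. It is not the paper's Dynamic DPO variant, whose details are not given here. The function name, parameter names, and the default `beta` are illustrative assumptions. Reading "chosen" as the rewrite a user preferred and "rejected" as the one they did not is likewise an assumption about how the loss would map onto CQR.

```python
import math


def dpo_loss(policy_logp_chosen, policy_logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) pair.

    Inputs are sequence log-probabilities under the trainable policy and
    under a frozen reference model. All names here are hypothetical.
    """
    # Log-ratio of policy to reference for each rewrite.
    chosen_ratio = policy_logp_chosen - ref_logp_chosen
    rejected_ratio = policy_logp_rejected - ref_logp_rejected

    # Scaled preference margin; beta controls how strongly the policy is
    # pushed away from the reference model.
    margin = beta * (chosen_ratio - rejected_ratio)

    # loss = -log(sigmoid(margin)), computed in a numerically stable way.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Hypothetical log-probabilities: the policy favors the chosen rewrite
# more than the reference does, so the loss falls below log(2).
example = dpo_loss(-1.0, -3.0, -2.0, -2.0)
```

When the policy and the reference agree exactly, the margin is zero and the loss equals log(2); it shrinks as the policy assigns relatively more probability to the preferred rewrite.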

Topics

Machine Learning > Learning Types > Self-Supervised Learning
Natural Language Processing > Generation > Dialogue Systems

Keywords

direct preference optimization, conversational AI, feedback learning, contextual query rewriting
