Dynamic Dialogue Policy for Continual Reinforcement Learning

Christian Geishauser; Carel van Niekerk; Hsien-Chin Lin; Nurul Lubis; Michael Heck; Shutong Feng; Milica Gasic

COLING 2022


Abstract

Continual learning is one of the key components of human learning and a necessary requirement of artificial intelligence. As dialogue can potentially span infinitely many topics and tasks, a task-oriented dialogue system must be able to learn continually, adapting dynamically to new challenges while preserving the knowledge it has already acquired. Despite its importance, continual reinforcement learning of the dialogue policy has remained largely unaddressed. The lack of a framework with training protocols, baseline models and suitable metrics has so far hindered research in this direction. In this work we fill precisely this gap, enabling research in dialogue policy optimisation to go from static to dynamic learning. We provide a continual learning algorithm, baseline architectures and metrics for assessing continual learning models. Moreover, we propose the dynamic dialogue policy transformer (DDPT), a novel dynamic architecture that can integrate new knowledge seamlessly, is capable of handling large state spaces and obtains significant zero-shot performance when exposed to unseen domains, without any growth in network parameter size. We validate the strengths of DDPT in simulation with two user simulators as well as with humans.



Topics

Artificial Intelligence > Core AI > Agent Systems; Machine Learning > Learning Types > Continual Learning; Reinforcement Learning > Methods > Deep RL; Reinforcement Learning > Methods > Policy Learning; Natural Language Processing > Applications > Dialogue Systems; Artificial Intelligence > Core AI > Language

Keywords

reinforcement learning; continual learning; zero-shot learning; task-oriented dialogue; dialogue policy; dynamic architecture; zero-shot performance

