Causal Confusion Reduction for Robust Multi-Domain Dialogue Policy

Mahdin Rohmatillah; Jen-Tzung Chien

2021 INTERSPEECH INTERSPEECH 2021

Causal Confusion Reduction for Robust Multi-Domain Dialogue Policy

Abstract

In the multi-domain dialogue system, dialog policy plays an important role since it determines the suitable actions based on the user’s goals. However, in many recent works, most of the dialogue optimizations, especially that use reinforcement learning (RL) methods, do not perform well. The main problem is that the initial step of optimization that involves the behavior cloning (BC) methods suffer from the causal confusion problem, which means that the agent misidentifies true cause of an expert action in current state. This paper proposes a novel method to improve the performance of BC method in dialogue system. Instead of only predicting correct action given a state from dataset, we introduce the auxiliary tasks to predict both of current belief state and recent user utterance in order to reduce causal confusion of the expert action in the dataset since those features are important in every dialog turn. Experiments on ConvLab-2 shows that, by using this method, all of RL based optimizations are improved. Furthermore, the agent based on the proximal policy optimization shows very significant improvement with the help of the proposed BC agent weights both in policy evaluation as well as in end-to-end system evaluation.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Reinforcement Learning

🐣 Hot Topic Early Bird — behavior cloning

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Mahdin Rohmatillah , Jen-Tzung Chien

Topics

Artificial Intelligence > Core AI > Multi-Agent Systems Reinforcement Learning > Methods > Policy Learning

Keywords

reinforcement learning behavior cloning proximal policy optimization multi-domain dialogue dialogue policy causal confusion

Download PDF

Related papers

Energy-Friendly Keyword Spotting System Using Add-Based Convolution 2021

Dialogue Situation Recognition for Everyday Conversation Using Multimodal Information 2021

Using Games to Augment Corpora for Language Recognition and Confusability 2021

A Psychology-Driven Computational Analysis of Political Interviews 2021

The 2020 Personalized Voice Trigger Challenge: Open Datasets, Evaluation Metrics, Baseline System and Results 2021