Knowing What You Know: Calibrating Dialogue Belief State Distributions via Ensembles

Carel van Niekerk; Michael Heck; Christian Geishauser; Hsien-Chin Lin; Nurul Lubis; Marco Moresi; Milica Gasic

EMNLP 2020

Abstract

The ability to accurately track what happens during a conversation is essential for the performance of a dialogue system. Current state-of-the-art multi-domain dialogue state trackers achieve just over 55% accuracy on the current go-to benchmark, which means that in almost every second dialogue turn they place full confidence in an incorrect dialogue state. Belief trackers, on the other hand, maintain a distribution over possible dialogue states. However, they underperform dialogue state trackers and do not produce well-calibrated distributions. In this work we present state-of-the-art calibration performance for multi-domain dialogue belief trackers using a calibrated ensemble of models. Our resulting dialogue belief tracker also outperforms previous dialogue belief tracking models in terms of accuracy.
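The two ideas in the abstract, averaging an ensemble's distributions and measuring how well confidence matches accuracy, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the equal-weight averaging, and the use of expected calibration error (ECE) with ten equal-width bins are assumptions chosen for illustration, and the abstract does not say which calibration metric or ensembling scheme the paper uses.

```python
import numpy as np


def ensemble_average(member_probs):
    """Average the members' distributions (hypothetical helper).

    member_probs has shape (n_models, n_turns, n_values); the result has
    shape (n_turns, n_values). Equal weighting is an assumption.
    """
    return np.mean(np.asarray(member_probs, dtype=float), axis=0)


def expected_calibration_error(probs, labels, n_bins=10):
    """ECE over top-1 predictions (illustrative metric, assumed here).

    Predictions are grouped by confidence into equal-width bins; ECE is the
    bin-size-weighted average gap between accuracy and mean confidence.
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    confidences = probs.max(axis=1)
    predictions = probs.argmax(axis=1)
    correct = (predictions == labels).astype(float)

    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
            ece += in_bin.mean() * gap
    return float(ece)


# A tracker that is fully confident but right only half the time.
one_hot = [[1.0, 0.0], [1.0, 0.0]]
labels = [0, 1]
overconfident_ece = expected_calibration_error(one_hot, labels)

# Two disagreeing members averaged into a less confident distribution.
averaged = ensemble_average([[[1.0, 0.0], [1.0, 0.0]],
                             [[0.0, 1.0], [0.0, 1.0]]])
averaged_ece = expected_calibration_error(averaged, labels)
```

In this toy case the fully confident tracker has a large gap between confidence and accuracy, while the averaged distribution reports 50% confidence and is right 50% of the time, so its gap vanishes. Real ensembles will not behave this cleanly; the example only shows what "calibrated" means operationally.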

Topics

Machine Learning > Core Methods > Classification
Machine Learning > Optimization & Theory > Bayesian Inference

Keywords

ensemble learning, belief tracking, state estimation, dialogue system
