Propagating Uncertainty in Reinforcement Learning via Wasserstein Barycenters

Alberto Maria Metelli; Amarildo Likmeta; Marcello Restelli

NeurIPS 2019

Abstract

How does the uncertainty of the value function propagate when performing temporal difference learning? In this paper, we address this question by proposing a Bayesian framework in which we employ approximate posterior distributions to model the uncertainty of the value function and Wasserstein barycenters to propagate it across state-action pairs. Leveraging these tools, we present an algorithm, Wasserstein Q-Learning (WQL), first in the tabular case, and then show how it can be extended to continuous domains. Furthermore, we prove that, under mild assumptions, a slight variation of WQL enjoys desirable theoretical properties in the tabular setting. Finally, we present an experimental campaign that shows the effectiveness of WQL on finite problems, compared to several RL algorithms, some of which are specifically designed for exploration, along with some preliminary results on Atari games.
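To make the idea of propagating uncertainty with a barycenter concrete, the following is a minimal sketch, not the authors' implementation. It assumes that each Q-value posterior is a one-dimensional Gaussian and that the 2-Wasserstein distance is used; under those assumptions the barycenter of two Gaussians is again Gaussian, with mean and standard deviation given by the weighted averages of the inputs. The names `Gaussian`, `w2_barycenter`, `td_target`, and `wasserstein_td_update` are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple


class Gaussian(NamedTuple):
    """Assumed posterior family: a 1-D Gaussian over a Q-value."""
    mean: float
    std: float


def w2_barycenter(a: Gaussian, b: Gaussian, weight_b: float) -> Gaussian:
    """2-Wasserstein barycenter of two 1-D Gaussians.

    In one dimension the barycenter averages quantile functions, so both
    the mean and the standard deviation are weighted averages.
    """
    w = weight_b
    return Gaussian((1 - w) * a.mean + w * b.mean,
                    (1 - w) * a.std + w * b.std)


def td_target(reward: float, gamma: float, next_value: Gaussian) -> Gaussian:
    """Distribution of reward + gamma * V(s') when V(s') is Gaussian."""
    return Gaussian(reward + gamma * next_value.mean,
                    gamma * next_value.std)


def wasserstein_td_update(q: Gaussian, reward: float, gamma: float,
                          next_value: Gaussian, alpha: float) -> Gaussian:
    """Move the current posterior toward the TD target posterior.

    The learning rate alpha acts as the barycenter weight of the target
    (an assumption of this sketch).
    """
    return w2_barycenter(q, td_target(reward, gamma, next_value), alpha)


# Illustrative numbers only.
updated = wasserstein_td_update(
    q=Gaussian(mean=0.0, std=1.0),
    reward=1.0,
    gamma=0.5,
    next_value=Gaussian(mean=1.0, std=0.5),
    alpha=0.5,
)
```

In this sketch the standard deviation of the next state's value estimate flows into the updated posterior alongside its mean, which is one way to read the abstract's notion of uncertainty being propagated across state-action pairs.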

Authors

Alberto Maria Metelli, Amarildo Likmeta, Marcello Restelli

Topics

Artificial Intelligence > Bayesian & Probabilistic > Probabilistic Modeling
Machine Learning > Optimization & Theory > Bayesian Inference
Reinforcement Learning > Methods > Deep RL
Machine Learning > Bayesian & Probabilistic > Bayesian Inference
Machine Learning > Learning Types > Uncertainty Quantification

Keywords

Wasserstein distance, reinforcement learning, temporal difference learning, Bayesian inference, uncertainty quantification, Wasserstein barycenter

Related papers

Two Generator Game: Learning to Sample via Linear Goodness-of-Fit Test 2019

Metalearned Neural Memory 2019

Model Similarity Mitigates Test Set Overuse 2019

Continual Unsupervised Representation Learning 2019

Reinforcement Learning with Convex Constraints 2019