Randomized Prior Functions for Deep Reinforcement Learning

Ian Osband; John Aslanides; Albin Cassirer

NeurIPS 2018

Abstract

Dealing with uncertainty is essential for efficient reinforcement learning. There is a growing literature on uncertainty estimation for deep learning from fixed datasets, but many of the most popular approaches are poorly suited to sequential decision problems. Other methods, such as bootstrap sampling, have no mechanism for uncertainty that does not come from the observed data. We highlight why this can be a crucial shortcoming and propose a simple remedy through addition of a randomized untrainable 'prior' network to each ensemble member. We prove that this approach is efficient with linear representations, provide simple illustrations of its efficacy with nonlinear representations and show that this approach scales to large-scale problems far better than previous attempts.
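The abstract names the idea without giving the construction, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes a linear model, a ridge-regularized least-squares fit, and an additive prior scaled by a hypothetical `prior_scale` parameter; all function names are illustrative. Each ensemble member sums a trainable linear map and a fixed random one, and only the trainable part is fit to a bootstrap resample of the data.

```python
import numpy as np


def fit_member(X, y, rng, prior_scale=1.0, ridge=1e-3):
    """Fit one ensemble member: trainable weights plus a fixed random prior."""
    n, d = X.shape
    # Fixed, untrainable prior weights: sampled once and never updated.
    prior_w = rng.normal(size=d)
    # Bootstrap resample of the observed data (sampling with replacement).
    idx = rng.integers(0, n, size=n)
    Xb, yb = X[idx], y[idx]
    # Fit only the trainable part so that (trainable + prior) matches the targets.
    residual = yb - prior_scale * (Xb @ prior_w)
    train_w = np.linalg.solve(Xb.T @ Xb + ridge * np.eye(d), Xb.T @ residual)
    return train_w, prior_w


def predict(member, x, prior_scale=1.0):
    """Prediction is the sum of the trained function and the scaled prior."""
    train_w, prior_w = member
    return x @ train_w + prior_scale * (x @ prior_w)


def ensemble_predictions(X, y, x_query, num_members=20, prior_scale=1.0, seed=0):
    """Return one prediction per ensemble member at a single query point."""
    rng = np.random.default_rng(seed)
    members = [fit_member(X, y, rng, prior_scale) for _ in range(num_members)]
    return np.array([predict(m, x_query, prior_scale) for m in members])


def make_data(n=50, seed=1):
    """Toy data (an assumption for illustration): the second feature is never observed as nonzero."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, size=n)
    X = np.stack([x, np.zeros(n)], axis=1)
    y = 2.0 * x + 0.1 * rng.normal(size=n)
    return X, y
```

In this toy setup, querying along the feature direction that never varies in the data gives identical predictions across members when `prior_scale=0.0` (plain bootstrap), while `prior_scale=1.0` leaves the members in disagreement there. That is one way to read the abstract's point that bootstrap sampling alone carries no uncertainty beyond the observed data.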

Topics

Artificial Intelligence > Bayesian & Probabilistic > Probabilistic Modeling
Reinforcement Learning > Methods > Deep RL
Artificial Intelligence > Bayesian & Probabilistic > Bayesian Inference

Keywords

deep reinforcement learning, ensemble method, uncertainty estimation, Bayesian exploration, bootstrap sampling, prior network, randomized prior

Related papers

Maximum Causal Tsallis Entropy Imitation Learning 2018

Recurrent World Models Facilitate Policy Evolution 2018

Bandit Learning in Concave N-Person Games 2018

Algorithmic Assurance: An Active Approach to Algorithmic Testing using Bayesian Optimisation 2018

PAC-Bayes bounds for stable algorithms with instance-dependent priors 2018