Bayesian Reinforcement Learning via Deep, Sparse Sampling

Divya Grover; Debabrota Basu; Christos Dimitrakakis

AISTATS 2020

Abstract

We address the problem of Bayesian reinforcement learning using efficient model-based online planning. We propose an optimism-free Bayes-adaptive algorithm that induces deeper and sparser exploration, with a theoretical bound on its performance relative to the Bayes-optimal policy as well as lower computational complexity. The main novelty is the use of a candidate policy generator to generate long-term options in the planning tree (over beliefs), which allows us to create much sparser and deeper trees. Experimental results on different environments show that, compared with the state of the art, our algorithm is both computationally more efficient and obtains significantly higher reward over time in discrete environments.
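The idea of planning over whole candidate policies, rather than over single actions, can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a small discrete MDP with known rewards, a Dirichlet belief over transitions stored as pseudo-counts, and candidate policies obtained by value iteration on posterior-sampled models. It evaluates only one level of the tree and does not update the belief during rollouts. All function and parameter names (`sample_mdp`, `greedy_policy`, `rollout`, `plan_action`, `n_policies`, `n_models`, `depth`) are hypothetical.

```python
import numpy as np

def sample_mdp(counts, rng):
    """Draw one transition tensor P[s, a, s'] from a Dirichlet belief (pseudo-counts)."""
    n_states, n_actions, _ = counts.shape
    P = np.empty_like(counts, dtype=float)
    for s in range(n_states):
        for a in range(n_actions):
            P[s, a] = rng.dirichlet(counts[s, a])
    return P

def greedy_policy(P, R, gamma=0.95, iters=200):
    """Value iteration on one sampled MDP; returns the greedy action for each state."""
    V = np.zeros(P.shape[0])
    for _ in range(iters):
        Q = R + gamma * (P @ V)  # Q[s, a]
        V = Q.max(axis=1)
    return Q.argmax(axis=1)

def rollout(policy, P, R, state, depth, gamma, rng):
    """Discounted return of following a fixed policy for `depth` steps in model P."""
    total, discount = 0.0, 1.0
    for _ in range(depth):
        a = policy[state]
        total += discount * R[state, a]
        state = rng.choice(P.shape[0], p=P[state, a])
        discount *= gamma
    return total

def plan_action(counts, R, state, n_policies=4, n_models=4, depth=20,
                gamma=0.95, seed=0):
    """Score a few candidate policies by deep rollouts in a few sampled models."""
    rng = np.random.default_rng(seed)
    # Candidate policies: each is optimal for one posterior-sampled MDP (assumption).
    candidates = [greedy_policy(sample_mdp(counts, rng), R, gamma)
                  for _ in range(n_policies)]
    # Evaluation models: a sparse set of posterior samples shared by all candidates.
    models = [sample_mdp(counts, rng) for _ in range(n_models)]
    scores = [float(np.mean([rollout(pi, P, R, state, depth, gamma, rng)
                             for P in models]))
              for pi in candidates]
    best = candidates[int(np.argmax(scores))]
    return int(best[state]), scores
```

The tree in this sketch is sparse because it branches over only `n_policies` candidates and `n_models` sampled models, and deep because each branch follows a policy for `depth` steps instead of expanding every action at every step.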

Topics

Artificial Intelligence > Bayesian & Probabilistic > Bayesian Learning
Reinforcement Learning > Methods > Deep RL
Reinforcement Learning > Applications > Value Iteration
Machine Learning > Bayesian & Probabilistic > Bayesian Inference
Machine Learning > Learning Types > Uncertainty Quantification

Keywords

model-based planning; Bayesian reinforcement learning; computational efficiency; belief state; online planning; performance bound; model-based online planning; sparse exploration
