Bayes-Adaptive Simulation-based Search with Value Function Approximation

Arthur Guez; Nicolas Heess; David Silver; Peter Dayan

NIPS 2014 (NeurIPS 2014)

Abstract

Bayes-adaptive planning offers a principled solution to the exploration-exploitation trade-off under model uncertainty. It finds the optimal policy in belief space, which explicitly accounts for the expected effect on future rewards of reductions in uncertainty. However, the Bayes-adaptive solution is typically intractable in domains with large or continuous state spaces. We present a tractable method for approximating the Bayes-adaptive solution by combining simulation-based search with a novel value function approximation technique that generalises over belief space. Our method outperforms prior approaches in both discrete bandit tasks and simple continuous navigation and control tasks.
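To make the idea of planning in belief space concrete, the following is a minimal sketch for a toy Bernoulli bandit, not the method proposed in the paper. It computes the Bayes-adaptive value exactly by dynamic programming over posterior counts, which is only feasible for tiny problems and is the kind of computation the paper's simulation-based search is meant to approximate. The function name `bayes_adaptive_value`, the independent Beta(1, 1) priors, and the undiscounted finite horizon are all assumptions made for illustration.

```python
from functools import lru_cache


def bayes_adaptive_value(horizon, n_arms=2):
    """Exact finite-horizon Bayes-adaptive value for a Bernoulli bandit.

    Assumes independent Beta(1, 1) priors on each arm's success probability
    (an illustrative choice, not taken from the paper).
    """

    @lru_cache(maxsize=None)
    def value(belief, steps_left):
        # belief: tuple of (successes, failures) counts, one pair per arm.
        if steps_left == 0:
            return 0.0
        best = float("-inf")
        for arm, (s, f) in enumerate(belief):
            # Posterior mean of this arm under the Beta(1, 1) prior.
            p = (s + 1) / (s + f + 2)
            # Successor beliefs after observing a success or a failure.
            win = belief[:arm] + ((s + 1, f),) + belief[arm + 1:]
            lose = belief[:arm] + ((s, f + 1),) + belief[arm + 1:]
            # Expected reward now plus the value of acting optimally
            # from the updated belief.
            q = p * (1.0 + value(win, steps_left - 1)) \
                + (1.0 - p) * value(lose, steps_left - 1)
            best = max(best, q)
        return best

    start = tuple((0, 0) for _ in range(n_arms))
    return value(start, horizon)


if __name__ == "__main__":
    for h in (1, 2, 3):
        print(h, bayes_adaptive_value(h))
```

With two arms and a horizon of 2, the value is 13/12 rather than the 1.0 obtained by ignoring what the first pull reveals; the difference is the expected benefit of the reduction in uncertainty. The number of reachable beliefs grows quickly with the horizon and the number of arms, which is why exact computation of this kind does not scale to large or continuous state spaces.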

Authors

Arthur Guez, Nicolas Heess, David Silver, Peter Dayan

Topics

Artificial Intelligence > Core AI > Planning

Artificial Intelligence > Bayesian & Probabilistic > Bayesian Learning

Machine Learning > Optimization & Theory > Bayesian Inference

Reinforcement Learning > Methods > Deep RL

Reinforcement Learning > Methods > Policy Learning

Machine Learning > Learning Types > Reinforcement Learning

Machine Learning > Bayesian & Probabilistic > Bayesian Inference

Mathematics & Optimization > Optimization > Optimal Control

Machine Learning > Learning Types > Exploration-Exploitation

Keywords

reinforcement learning, bayesian learning, bayesian inference, model uncertainty, exploration exploitation, belief space, value function approximation, bayesian optimization, belief space planning, simulation-based search, bayes-adaptive planning
