Cost-Sensitive Exploration in Bayesian Reinforcement Learning

Dongho Kim; Kee-eung Kim; Pascal Poupart

NeurIPS (NIPS) 2012

Abstract

We consider Bayesian reinforcement learning (BRL) in which actions incur costs in addition to rewards, so exploration must be constrained in terms of the expected total cost while the agent learns to maximize the expected long-term total reward. To formalize cost-sensitive exploration, we model the environment as a constrained Markov decision process (CMDP), in which exploration requirements can be encoded naturally through the cost function. We extend BEETLE, a model-based BRL method, to learning in environments with cost constraints. We demonstrate the resulting cost-sensitive exploration behaviour in a number of simulated problems.
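To make the CMDP objective concrete, the following is a minimal sketch, not the paper's method: it assumes a fully known two-state model (so nothing is Bayesian and BEETLE is not involved), a discounted criterion, and a policy that picks a costly action with the same probability in every state. All states, actions, numbers, and function names are hypothetical. It only illustrates the trade-off the abstract describes: maximize expected total reward while keeping expected total cost within a budget.

```python
# Hypothetical toy CMDP; every number below is made up for illustration.
GAMMA = 0.9            # discount factor (assumed criterion)
BUDGET = 3.0           # bound on the expected discounted total cost
STATES = [0, 1]
ACTIONS = ["safe", "risky"]

# T[s][a][s2] = probability of moving from s to s2 under action a
T = {
    0: {"safe": [1.0, 0.0], "risky": [0.2, 0.8]},
    1: {"safe": [0.5, 0.5], "risky": [0.0, 1.0]},
}
# R[s][a] = reward, C[s][a] = cost; only the risky action incurs cost
R = {0: {"safe": 0.0, "risky": 0.0}, 1: {"safe": 1.0, "risky": 2.0}}
C = {0: {"safe": 0.0, "risky": 1.0}, 1: {"safe": 0.0, "risky": 1.0}}


def evaluate(p_risky, signal, start=0, sweeps=500):
    """Expected discounted total of `signal` (R or C) from `start` under the
    policy that chooses "risky" with probability p_risky in every state."""
    probs = {"safe": 1.0 - p_risky, "risky": p_risky}
    v = [0.0 for _ in STATES]
    for _ in range(sweeps):
        # Synchronous policy-evaluation backup: the right-hand side reads the old v.
        v = [
            sum(
                probs[a]
                * (signal[s][a] + GAMMA * sum(T[s][a][s2] * v[s2] for s2 in STATES))
                for a in ACTIONS
            )
            for s in STATES
        ]
    return v[start]


def best_feasible(budget, grid=100, tol=1e-9):
    """Grid search over p_risky: highest expected reward among policies whose
    expected cost stays within the budget (tol absorbs rounding error)."""
    candidates = [i / grid for i in range(grid + 1)]
    feasible = [p for p in candidates if evaluate(p, C) <= budget + tol]
    return max(feasible, key=lambda p: evaluate(p, R))


best_p = best_feasible(BUDGET)
print("p_risky:", best_p)
print("expected reward:", evaluate(best_p, R))
print("expected cost:", evaluate(best_p, C))
```

In this toy model the always-risky policy exceeds the budget, so the search settles on a randomized policy. A full CMDP solver would optimize over all stationary policies rather than over a single mixing probability, and the BRL setting would additionally have to account for uncertainty about the model itself.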

Keywords

Bayesian reinforcement learning; model-based reinforcement learning; cost-sensitive exploration; constrained Markov decision process; model-based BRL; exploration constraints; exploration policy; expected total cost
