Nonparametric Bayesian Inverse Reinforcement Learning for Multiple Reward Functions

Jaedeug Choi; Kee-eung Kim

NeurIPS (NIPS) 2012

Abstract

We present a nonparametric Bayesian approach to inverse reinforcement learning (IRL) for multiple reward functions. Most previous IRL algorithms assume that the behaviour data is obtained from an agent who is optimizing a single reward function, but this assumption is hard to satisfy in practice. Our approach integrates the Dirichlet process mixture model into Bayesian IRL. We provide an efficient Metropolis-Hastings sampling algorithm that uses the gradient of the posterior to estimate the underlying reward functions, and we demonstrate through experiments on a number of problem domains that our approach outperforms previous ones.
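The two ingredients named in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify the form of the proposal, so a Langevin-style proposal is assumed here as one common way to use a posterior gradient inside Metropolis-Hastings, and a toy Gaussian target stands in for the reward posterior. The names `crp_prior` and `langevin_mh_step` are hypothetical.

```python
import math
import random


def crp_prior(cluster_sizes, alpha):
    """Chinese restaurant process prior for assigning one new trajectory.

    Returns the probability of joining each existing cluster, followed by
    the probability of opening a new cluster (a new reward function).
    """
    total = sum(cluster_sizes) + alpha
    return [n / total for n in cluster_sizes] + [alpha / total]


def langevin_mh_step(x, log_post, grad_log_post, step, rng):
    """One Metropolis-Hastings step with a gradient-informed proposal.

    Assumption: a Langevin proposal N(x + step/2 * grad, step * I).
    """
    def drift(point):
        grad = grad_log_post(point)
        return [p + 0.5 * step * g for p, g in zip(point, grad)]

    def log_q(to, frm):
        # Log proposal density of `to` given `frm`, up to a constant.
        mean = drift(frm)
        return -sum((t - m) ** 2 for t, m in zip(to, mean)) / (2.0 * step)

    mean = drift(x)
    proposal = [m + math.sqrt(step) * rng.gauss(0.0, 1.0) for m in mean]
    # The proposal is asymmetric, so the Hastings correction is required.
    log_accept = (log_post(proposal) + log_q(x, proposal)
                  - log_post(x) - log_q(proposal, x))
    if log_accept >= 0 or rng.random() < math.exp(log_accept):
        return proposal, True
    return x, False


# Toy stand-in for a reward posterior: a standard normal density.
def toy_log_post(r):
    return -0.5 * sum(v * v for v in r)


def toy_grad_log_post(r):
    return [-v for v in r]


rng = random.Random(0)
reward = [3.0]
for _ in range(1000):
    reward, _accepted = langevin_mh_step(
        reward, toy_log_post, toy_grad_log_post, 0.5, rng)
```

In a Dirichlet process mixture of this kind, a sampler would typically alternate between reassigning trajectories to clusters (combining the prior above with a trajectory likelihood) and updating each cluster's reward function; the likelihood term is omitted here because the abstract does not define it.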

Topics

Artificial Intelligence > Bayesian & Probabilistic > Bayesian Learning
Reinforcement Learning > Methods > Policy Learning
Machine Learning > Bayesian & Probabilistic > Bayesian Learning
Machine Learning > Core Methods > Probabilistic Modeling
Machine Learning > Learning Types > Reinforcement Learning

Keywords

dirichlet process, nonparametric bayesian, bayesian learning, bayesian inference, metropolis-hastings sampling, inverse reinforcement learning, reward function, bayesian inverse reinforcement learning, reward function estimation, multiple reward functions
