Nonparametric Bayesian Policy Priors for Reinforcement Learning

Finale Doshi-Velez; David Wingate; Nicholas Roy; Joshua B. Tenenbaum

NIPS (NeurIPS) 2010

Abstract

We consider reinforcement learning in partially observable domains where the agent can query an expert for demonstrations. Our nonparametric Bayesian approach combines model knowledge, inferred from expert information and independent exploration, with policy knowledge inferred from expert trajectories. We introduce priors that bias the agent towards models with both simple representations and simple policies, resulting in improved policy and model learning.

Keywords

reinforcement learning, nonparametric Bayesian, partially observable domains, partially observable, expert demonstration, model learning, policy prior
