Bootstrapping Apprenticeship Learning

Abdeslam Boularias; Brahim Chaib-draa

NeurIPS (NIPS) 2010

Abstract

We consider the problem of apprenticeship learning where the examples, demonstrated by an expert, cover only a small part of a large state space. Inverse Reinforcement Learning (IRL) provides an efficient tool for generalizing the demonstration, based on the assumption that the expert is maximizing a utility function that is a linear combination of state-action features. Most IRL algorithms use a simple Monte Carlo estimation to approximate the expected feature counts under the expert's policy. In this paper, we show that the quality of the learned policies is highly sensitive to the error in estimating the feature counts. To reduce this error, we introduce a novel approach for bootstrapping the demonstration by assuming that (i) the expert is (near-)optimal, and (ii) the dynamics of the system are known. Empirical results on gridworlds and car racing problems show that our approach is able to learn good policies from a small number of demonstrations.
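The Monte Carlo estimate of expected feature counts mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard discounted form, in which the estimate is the average over demonstrated trajectories of the sum of gamma^t times the feature vector at step t. The function names, the one-hot feature map, the discount factor, and the toy trajectories are all hypothetical and chosen only for illustration; this is not the authors' code and it does not implement their bootstrapping method.

```python
from typing import Callable, List, Sequence, Tuple

State = int
Action = int
Trajectory = Sequence[Tuple[State, Action]]


def monte_carlo_feature_counts(
    trajectories: Sequence[Trajectory],
    features: Callable[[State, Action], Sequence[float]],
    gamma: float,
) -> List[float]:
    """Average the discounted feature sums over the demonstrated trajectories."""
    if not trajectories:
        raise ValueError("at least one trajectory is required")
    totals: List[float] = []
    for trajectory in trajectories:
        for t, (state, action) in enumerate(trajectory):
            phi = features(state, action)
            if not totals:
                # Size the accumulator from the first feature vector seen.
                totals = [0.0] * len(phi)
            discount = gamma ** t
            for i, value in enumerate(phi):
                totals[i] += discount * value
    return [total / len(trajectories) for total in totals]


def one_hot_state_features(state: State, action: Action) -> List[float]:
    # Hypothetical feature map: indicator of which of 3 states is visited.
    return [1.0 if state == k else 0.0 for k in range(3)]


# Two short, made-up expert trajectories of (state, action) pairs.
demos = [
    [(0, 0), (1, 0), (2, 0)],
    [(0, 0), (0, 1), (1, 0)],
]
estimate = monte_carlo_feature_counts(demos, one_hot_state_features, gamma=0.5)
print(estimate)
```

With only a handful of short trajectories, an estimate of this kind averages over very few visited state-action pairs, which is the source of the estimation error the abstract refers to.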

Topics

Artificial Intelligence > Core AI > Agent Systems
Artificial Intelligence > Core AI > Planning
Artificial Intelligence > Core AI > Reinforcement Learning
Reinforcement Learning > Methods > Policy Learning
Reinforcement Learning > Applications > Robotics
Machine Learning > Learning Types > Reinforcement Learning
Machine Learning > Learning Types > Imitation Learning

Keywords

imitation learning, feature matching, policy learning, inverse reinforcement learning, Monte Carlo estimation, apprenticeship learning, feature counts, bootstrapping, feature-based reward, utility function, expert demonstration
