ICML 2018

Policy and Value Transfer in Lifelong Reinforcement Learning

Abstract

We consider the problem of how best to use prior experience to bootstrap lifelong learning, where an agent faces a series of task instances drawn from some task distribution. First, we identify the initial policy that optimizes expected performance over the distribution of tasks for increasingly complex classes of policy and task distributions. We empirically demonstrate the relative performance of each policy class's optimal element in a variety of simple task distributions. We then consider value-function initialization methods that preserve PAC guarantees while simultaneously minimizing the learning required in two learning algorithms, yielding MaxQInit, a practical new method for value-function-based transfer. We show that MaxQInit performs well in simple lifelong RL experiments.
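The abstract does not spell out the MaxQInit rule, so the following is only a minimal sketch of value-function-based transfer under a stated assumption: that the new task's Q-table is initialized to the elementwise maximum of the Q-values learned on previously seen tasks, with an optimistic fallback for state-action pairs never seen before. The function name `max_q_init`, the dictionary-based tables, and the `default` parameter are hypothetical and chosen for illustration. The sketch does not check the sample-size condition that PAC guarantees would require.

```python
from typing import Dict, Hashable, Iterable, List, Tuple

# A tabular Q-function: (state, action) -> estimated value.
QTable = Dict[Tuple[Hashable, Hashable], float]


def max_q_init(
    prior_q_tables: Iterable[QTable],
    states: Iterable[Hashable],
    actions: Iterable[Hashable],
    default: float,
) -> QTable:
    """Build an initial Q-table for a new task (illustrative assumption).

    Each entry is the maximum value observed for that (state, action)
    pair across the prior tasks. Pairs absent from every prior table
    fall back to `default`, which the caller should set to an
    optimistic upper bound such as r_max / (1 - gamma).
    """
    tables: List[QTable] = list(prior_q_tables)
    action_list = list(actions)
    init: QTable = {}
    for s in states:
        for a in action_list:
            seen = [q[(s, a)] for q in tables if (s, a) in q]
            init[(s, a)] = max(seen) if seen else default
    return init


# Two hypothetical Q-tables from earlier tasks in the same distribution.
q_task1: QTable = {("s0", "left"): 0.2, ("s0", "right"): 0.9}
q_task2: QTable = {("s0", "left"): 0.5, ("s0", "right"): 0.4}

q0 = max_q_init(
    [q_task1, q_task2],
    states=["s0", "s1"],
    actions=["left", "right"],
    default=1.0,
)
```

Here `q0` takes the larger of the two prior estimates for each pair in `s0`, while `s1`, which no prior task covered, keeps the optimistic default. Whether such an initialization remains optimistic for a new task depends on how many prior tasks were sampled, which this sketch leaves to the caller.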
