PAC-inspired Option Discovery in Lifelong Reinforcement Learning

Emma Brunskill; Lihong Li

ICML 2014

Abstract

A key goal of AI is to create lifelong learning agents that can leverage prior experience to improve performance on later tasks. In reinforcement-learning problems, one way to summarize prior experience for future use is through options, which are temporally extended actions (subpolicies) for how to behave. Options can then be used to potentially accelerate learning in new reinforcement learning tasks. In this work, we provide the first formal analysis of the sample complexity, a measure of learning speed, of reinforcement learning with options. This analysis helps shed light on some interesting prior empirical results on when and how options may accelerate learning. We then quantify the benefit of options in reducing sample complexity of a lifelong learning agent. Finally, the new theoretical insights inspire a novel option-discovery algorithm that aims at minimizing overall sample complexity in lifelong reinforcement learning.
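To make the notion of an option concrete, the following is a minimal sketch, not the authors' algorithm. It assumes the standard formulation of an option as an initiation set, an internal policy, and a termination condition. The chain environment `chain_step`, the helper `run_option`, and the `go_right` option are hypothetical names introduced only for illustration, and termination is deterministic here for simplicity.

```python
from dataclasses import dataclass
from typing import Callable, Set, Tuple

State = int
Action = int


@dataclass
class Option:
    """A temporally extended action: where it may start, how it acts, when it stops."""
    initiation_set: Set[State]
    policy: Callable[[State], Action]
    terminates: Callable[[State], bool]  # deterministic termination (an assumption)


def chain_step(state: State, action: Action, n_states: int = 5) -> Tuple[State, float]:
    """Hypothetical deterministic chain MDP with states 0..n_states-1.

    Action +1 moves right and -1 moves left (clipped at the ends).
    Reward is 1.0 when the step lands on the last state, else 0.0.
    """
    next_state = min(max(state + action, 0), n_states - 1)
    reward = 1.0 if next_state == n_states - 1 else 0.0
    return next_state, reward


def run_option(option: Option, state: State,
               step: Callable[[State, Action], Tuple[State, float]] = chain_step,
               max_steps: int = 100) -> Tuple[State, float, int]:
    """Follow the option's policy until it terminates or max_steps is reached.

    Returns (final state, undiscounted reward sum, number of primitive steps).
    """
    if state not in option.initiation_set:
        raise ValueError(f"option cannot be started in state {state}")
    total_reward = 0.0
    steps = 0
    while steps < max_steps:
        state, reward = step(state, option.policy(state))
        total_reward += reward
        steps += 1
        if option.terminates(state):
            break
    return state, total_reward, steps


# A single option that walks right until it reaches the last state.
go_right = Option(
    initiation_set={0, 1, 2, 3},
    policy=lambda s: 1,
    terminates=lambda s: s == 4,
)

final_state, total_reward, n_steps = run_option(go_right, 0)
print(final_state, total_reward, n_steps)
```

From the agent's point of view, invoking `go_right` is one decision that spans several primitive steps, which is the sense in which options are temporally extended.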

Topics

Machine Learning > Optimization & Theory > Learning Theory

Reinforcement Learning > Methods > Policy Learning

Reinforcement Learning > Methods > Multi-Agent Systems

Machine Learning > Learning Paradigms > Meta-Learning

Keywords

reinforcement learning, sample complexity, PAC learning, temporal abstraction, lifelong learning, option discovery
