Online Multi-Task Learning for Policy Gradient Methods

Haitham Bou Ammar; Eric Eaton; Paul Ruvolo; Matthew Taylor

ICML 2014

Abstract

Policy gradient algorithms have shown considerable recent success in solving high-dimensional sequential decision-making tasks, particularly in robotics. However, these methods often require extensive experience in a domain to achieve high performance. To make agents more sample-efficient, we developed a multi-task policy gradient method to learn decision-making tasks consecutively, transferring knowledge between tasks to accelerate learning. Our approach provides robust theoretical guarantees, and we show empirically that it dramatically accelerates learning on a variety of dynamical systems, including an application to quadrotor control.

Topics

Reinforcement Learning > Methods > Deep RL; Reinforcement Learning > Methods > Policy Learning; Reinforcement Learning > Applications > Robotics; Robotics; Machine Learning > Learning Types > Multi-Task Learning; Machine Learning > Learning Types > Transfer Learning; Computer Vision > Domain-Specific > Robotics; Robotics > Applications > Robotics

Keywords

reinforcement learning; sample efficiency; multi-task learning; policy gradient; sequential decision making; knowledge transfer; quadrotor control
