End-to-End Training of Deep Visuomotor Policies

Sergey Levine; Chelsea Finn; Trevor Darrell; Pieter Abbeel

JMLR 2016

Abstract

Policy search methods can allow robots to learn control policies for a wide range of tasks, but practical applications of policy search often require hand-engineered components for perception, state estimation, and low-level control. In this paper, we aim to answer the following question: does training the perception and control systems jointly end-to-end provide better performance than training each component separately? To this end, we develop a method that can be used to learn policies that map raw image observations directly to torques at the robot's motors. The policies are represented by deep convolutional neural networks (CNNs) with 92,000 parameters, and are trained using a guided policy search method, which transforms policy search into supervised learning, with supervision provided by a simple trajectory-centric reinforcement learning method. We evaluate our method on a range of real-world manipulation tasks that require close coordination between vision and control, such as screwing a cap onto a bottle, and present simulated comparisons to a range of prior policy search methods.
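The idea of turning policy search into supervised learning can be illustrated with a toy sketch. This is not the paper's method: it assumes a low-dimensional state instead of images, a linear policy instead of a CNN, and a fixed linear-Gaussian "teacher" controller standing in for the trajectory-centric reinforcement learning step. All names (`collect_teacher_data`, `fit_policy`, `K_teacher`, `k_teacher`) are hypothetical and chosen only for illustration.

```python
import numpy as np


def collect_teacher_data(K, k, num_samples, noise_std, rng):
    """Sample states and label them with a linear-Gaussian teacher's actions.

    The teacher u = K x + k + noise is an assumed stand-in for a
    trajectory-centric controller; it is not taken from the paper.
    """
    action_dim, state_dim = K.shape
    states = rng.normal(size=(num_samples, state_dim))
    noise = noise_std * rng.normal(size=(num_samples, action_dim))
    actions = states @ K.T + k + noise
    return states, actions


def fit_policy(states, actions):
    """Supervised step: least-squares fit of a linear policy to teacher actions."""
    # Append a constant feature so the policy can learn a bias term.
    features = np.hstack([states, np.ones((states.shape[0], 1))])
    weights, *_ = np.linalg.lstsq(features, actions, rcond=None)
    return weights  # shape: (state_dim + 1, action_dim)


def policy_action(weights, state):
    """Evaluate the fitted linear policy on a single state."""
    return np.append(state, 1.0) @ weights


rng = np.random.default_rng(0)
K_teacher = np.array([[1.0, -0.5, 0.2],
                      [0.0, 0.3, -1.0]])  # hypothetical feedback gains
k_teacher = np.array([0.1, -0.2])         # hypothetical offset
states, actions = collect_teacher_data(K_teacher, k_teacher, 500, 0.01, rng)
weights = fit_policy(states, actions)
```

In this sketch the learned policy only imitates one fixed teacher. The abstract describes supervision that itself comes from a reinforcement learning method, which this example leaves out.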

Topics

Machine Learning > Optimization & Theory > Neural Network Optimization

Reinforcement Learning > Applications > Robotics

Keywords

robot manipulation, deep convolutional neural network, guided policy search, visuomotor policy
