Adversarial Skill Chaining for Long-Horizon Robot Manipulation via Terminal State Regularization

Youngwoon Lee; Anima Anandkumar; Yuke Zhu; Joseph J. Lim

CoRL 2021

Abstract

Skill chaining is a promising approach for synthesizing complex behaviors by sequentially combining previously learned skills. Yet, a naive composition of skills fails when a policy encounters a starting state never seen during its training. For successful skill chaining, prior approaches attempt to widen the policy’s starting state distribution. However, these approaches require larger state distributions to be covered as more policies are sequenced, and thus are limited to short skill sequences. In this paper, we propose to chain multiple policies without excessively large initial state distributions by regularizing the terminal state distributions in an adversarial learning framework. We evaluate our approach on two complex long-horizon manipulation tasks of furniture assembly. Our results show that our method is the first model-free reinforcement learning algorithm to solve these tasks, whereas prior skill chaining approaches fail. The code and videos are available at https://clvrai.com/skill-chaining.
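To make the idea of adversarially regularizing terminal states more concrete, the following is a minimal toy sketch, not the authors' implementation. It assumes a logistic discriminator over low-dimensional states and a simple additive reward bonus; the function names (`train_discriminator`, `regularized_reward`), the `weight` parameter, and the 1-D toy data are all hypothetical choices made for illustration. The discriminator learns to tell states the next skill can start from apart from states where the previous skill currently ends, and the previous skill is then rewarded for ending in states the discriminator accepts.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_discriminator(next_skill_starts, prev_skill_ends, steps=2000, lr=0.1):
    """Fit a logistic discriminator D(s) = sigmoid(w . s + b) by gradient descent.

    Label 1: states the next skill was trained to start from.
    Label 0: states where the previous skill currently terminates.
    (Assumption: a linear model stands in for whatever network is used in practice.)
    """
    dim = len(next_skill_starts[0])
    w, b = [0.0] * dim, 0.0
    data = [(s, 1.0) for s in next_skill_starts] + [(s, 0.0) for s in prev_skill_ends]
    n = len(data)
    for _ in range(steps):
        grad_w, grad_b = [0.0] * dim, 0.0
        for s, y in data:
            err = sigmoid(sum(wi * si for wi, si in zip(w, s)) + b) - y
            for j in range(dim):
                grad_w[j] += err * s[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def regularized_reward(task_reward, state, is_terminal, disc, weight=1.0):
    """Add a bonus only at terminal states that resemble valid starts for the next skill."""
    if not is_terminal:
        return task_reward
    w, b = disc
    return task_reward + weight * sigmoid(sum(wi * si for wi, si in zip(w, state)) + b)


# Toy 1-D data (hypothetical): the next skill starts near 0.0,
# while the previous skill currently ends near 2.0.
rng = random.Random(0)
starts = [[rng.gauss(0.0, 0.1)] for _ in range(50)]
ends = [[rng.gauss(2.0, 0.1)] for _ in range(50)]
disc = train_discriminator(starts, ends)

bonus_near_start = regularized_reward(0.0, [0.0], True, disc)
bonus_far_away = regularized_reward(0.0, [2.0], True, disc)
print(bonus_near_start > bonus_far_away)
```

In this toy setup, ending close to the next skill's familiar starting region earns a larger bonus than ending far from it, which is the pressure that keeps the required starting state distribution from growing as more skills are chained.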

Authors

Youngwoon Lee, Joseph J. Lim, Anima Anandkumar, Yuke Zhu

Topics

Machine Learning > Learning Types > Adversarial Learning
Reinforcement Learning > Applications > Robotics

Keywords

adversarial learning, skill chaining, robot manipulation, long-horizon planning, state regularization

Related papers

FlingBot: The Unreasonable Effectiveness of Dynamic Manipulation for Cloth Unfolding 2021

TANDEM: Tracking and Dense Mapping in Real-time using Deep Multi-view Stereo 2021

Taskography: Evaluating robot task planning over large 3D scene graphs 2021

Parallelised Diffeomorphic Sampling-based Motion Planning 2021

Learning to Walk in Minutes Using Massively Parallel Deep Reinforcement Learning 2021