TACO: Learning Task Decomposition via Temporal Alignment for Control

Kyriacos Shiarlis; Markus Wulfmeier; Sasha Salter; Shimon Whiteson; Ingmar Posner

ICML 2018

Abstract

Many advanced Learning from Demonstration (LfD) methods consider the decomposition of complex, real-world tasks into simpler sub-tasks. By reusing the corresponding sub-policies within and between tasks, we can provide training data for each policy from different high-level tasks and compose them to perform novel ones. Existing approaches to modular LfD focus either on learning a single high-level task or depend on domain knowledge and temporal segmentation. In contrast, we propose a weakly supervised, domain-agnostic approach based on task sketches, which include only the sequence of sub-tasks performed in each demonstration. Our approach simultaneously aligns the sketches with the observed demonstrations and learns the required sub-policies. This improves generalisation in comparison to separate optimisation procedures. We evaluate the approach on multiple domains, including a simulated 3D robot arm control task using purely image-based observations. The results show that our approach performs commensurately with fully supervised approaches, while requiring significantly less annotation effort.
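To make the idea of aligning a task sketch with a demonstration concrete, the following is a minimal sketch of the alignment step only, under stated assumptions. It is not the authors' algorithm: it holds the sub-policies fixed and computes a single best (hard) alignment by dynamic programming, whereas the approach described above aligns and learns jointly. The function name `align_sketch` and the input `log_lik` (a per-timestep table of log-likelihoods, one column per sub-policy) are hypothetical and introduced purely for illustration.

```python
import math


def align_sketch(log_lik, sketch):
    """Best monotonic alignment of a task sketch to one demonstration.

    log_lik[t][k]: assumed log-likelihood that sub-policy k produced the
                   demonstrated action at timestep t (hypothetical input).
    sketch:        ordered list of sub-policy ids, e.g. [0, 1].
    Every sketch element must cover at least one contiguous timestep.
    Returns (total log-likelihood, list of sub-policy ids per timestep).
    """
    T, L = len(log_lik), len(sketch)
    if L == 0 or T < L:
        raise ValueError("need at least one timestep per sketch element")

    neg_inf = -math.inf
    # best[t][i]: score of the best alignment of timesteps 0..t that ends
    # with timestep t assigned to sketch position i.
    best = [[neg_inf] * L for _ in range(T)]
    # advanced[t][i]: True if the best path moved from position i-1 to i at t.
    advanced = [[False] * L for _ in range(T)]
    best[0][0] = log_lik[0][sketch[0]]

    for t in range(1, T):
        for i in range(min(t + 1, L)):
            stay = best[t - 1][i]
            move = best[t - 1][i - 1] if i > 0 else neg_inf
            if move > stay:
                advanced[t][i] = True
            best[t][i] = max(stay, move) + log_lik[t][sketch[i]]

    # Backtrack from the last timestep and the last sketch element.
    labels = [0] * T
    i = L - 1
    for t in range(T - 1, -1, -1):
        labels[t] = sketch[i]
        if advanced[t][i]:
            i -= 1
    return best[T - 1][L - 1], labels


# Toy demonstration: sub-policy 0 explains the first three steps well,
# sub-policy 1 explains the last two.
demo = [[math.log(0.9), math.log(0.1)]] * 3 + [[math.log(0.1), math.log(0.9)]] * 2
score, labels = align_sketch(demo, [0, 1])
print(labels)
```

In this toy case the sketch `[0, 1]` supplies only the order of sub-tasks, and the dynamic program recovers where the switch between them most plausibly occurs. A learning procedure could then use such alignments as training targets for each sub-policy, although the joint optimisation described in the abstract is what the paper reports as improving generalisation over separate procedures.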

Topics

Artificial Intelligence > Core AI > Planning
Machine Learning > Learning Types > Weakly Supervised Learning
Machine Learning > Learning Types > Imitation Learning
Reinforcement Learning > Applications > Robotics

Keywords

imitation learning, robot control, temporal alignment, learning from demonstration, task decomposition, sub-policy learning, robot arm control
