PLEX: Making the Most of the Available Data for Robotic Manipulation Pretraining

Garrett Thomas; Ching-An Cheng; Ricky Loynd; Felipe Vieira Frujeri; Vibhav Vineet; Mihai Jalobeanu; Andrey Kolobov

CoRL 2023

Abstract

A rich representation is key to general robotic manipulation, but existing approaches to representation learning require large amounts of multimodal demonstrations. In this work we propose PLEX, a transformer-based architecture that learns from a small amount of task-agnostic visuomotor trajectories and a much larger amount of task-conditioned object manipulation videos – a type of data available in quantity. PLEX uses visuomotor trajectories to induce a latent feature space and to learn task-agnostic manipulation routines, while diverse video-only demonstrations teach PLEX how to plan in the induced latent feature space for a wide variety of tasks. Experiments showcase PLEX’s generalization on Meta-World and state-of-the-art performance in challenging Robosuite environments. In particular, using relative positional encoding in PLEX’s transformers greatly helps in low-data regimes of learning from human-collected demonstrations.
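The abstract credits relative positional encoding for much of the benefit in low-data regimes but does not say how it is implemented. As a generic illustration only, and not the authors' method, the sketch below shows one common form: a bias added to attention scores that depends only on the clipped offset between two time steps, so the same pattern applies wherever a sub-sequence occurs in a trajectory. The function names, the table values, and the clipping distance are all hypothetical.

```python
import math

def relative_bias(seq_len, max_dist, table):
    """Build a seq_len x seq_len bias matrix from a per-offset table.

    `table` has 2 * max_dist + 1 entries, one per clipped offset (j - i).
    """
    bias = []
    for i in range(seq_len):
        row = []
        for j in range(seq_len):
            # Offsets beyond max_dist share the entry of the farthest offset.
            offset = max(-max_dist, min(max_dist, j - i))
            row.append(table[offset + max_dist])
        bias.append(row)
    return bias

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(scores, bias):
    """Add the relative bias to raw attention scores, then normalize each row."""
    return [softmax([s + b for s, b in zip(s_row, b_row)])
            for s_row, b_row in zip(scores, bias)]

# Hypothetical values: in a real model the table would be learned.
table = [-2.0, -1.0, 0.0, -1.0, -2.0]    # entries for offsets -2..2
bias = relative_bias(seq_len=4, max_dist=2, table=table)
scores = [[0.0] * 4 for _ in range(4)]   # uniform content scores, for illustration
weights = attention_weights(scores, bias)
```

Because the bias is indexed by offset rather than by absolute position, `bias[0][1]`, `bias[1][2]`, and `bias[2][3]` are the same entry of the table. That shift invariance is the property usually cited as the reason relative encodings generalize from few demonstrations.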

Topics

Artificial Intelligence > Learning Paradigms > Transfer Learning
Deep Learning > Architectures > Transformers

Keywords

transformer architecture, representation learning, few-shot learning, robotic manipulation, transfer learning, visuomotor trajectory
