SORNet: Spatial Object-Centric Representations for Sequential Manipulation

Wentao Yuan; Chris Paxton; Karthik Desingh; Dieter Fox

CoRL 2021

Abstract

Sequential manipulation tasks require a robot to perceive the state of an environment and plan a sequence of actions leading to a desired goal state, where the ability to reason about spatial relationships among object entities from raw sensor inputs is crucial. Prior work relying on explicit state estimation or end-to-end learning struggles with novel objects or new tasks. In this work, we propose SORNet (Spatial Object-Centric Representation Network), which extracts object-centric representations from RGB images conditioned on canonical views of the objects of interest. We show that the object embeddings learned by SORNet generalize zero-shot to unseen object entities on three spatial reasoning tasks: spatial relationship classification, skill precondition classification, and relative direction regression, significantly outperforming baselines. Further, we present real-world robotic experiments demonstrating the use of the learned object embeddings in task planning for sequential manipulation.
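To make the idea of embeddings "conditioned on canonical views" concrete, here is a minimal NumPy sketch. It is not the SORNet implementation: the abstract does not specify the architecture, so the single attention layer, the patch size, the embedding width, the random (untrained) weights, and every function name below are assumptions chosen only to illustrate the data flow. Scene patches and one canonical-view patch per object are embedded as tokens, mixed by self-attention, and the outputs at the canonical-view positions are read off as object embeddings. A hypothetical pairwise readout then scores spatial relations for each ordered object pair.

```python
import numpy as np

rng = np.random.default_rng(0)

def patchify(image, patch):
    """Split an (H, W, C) image into flattened non-overlapping patches."""
    h, w, c = image.shape
    rows, cols = h // patch, w // patch
    x = image[: rows * patch, : cols * patch]
    x = x.reshape(rows, patch, cols, patch, c).transpose(0, 2, 1, 3, 4)
    return x.reshape(rows * cols, patch * patch * c)

def self_attention(tokens, wq, wk, wv):
    """Single-head scaled dot-product self-attention over all tokens."""
    q, k, v = tokens @ wq, tokens @ wk, tokens @ wv
    scores = q @ k.T / np.sqrt(k.shape[1])
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ v

def object_embeddings(scene, canonical_views, patch=8, dim=32):
    """Return one embedding per canonical view, conditioned on the scene.

    Assumption: each canonical view is a single (patch, patch, C) crop.
    Weights are random here; a real model would learn them.
    """
    patch_dim = patch * patch * scene.shape[2]
    w_embed = rng.normal(0, 0.02, (patch_dim, dim))
    wq, wk, wv = (rng.normal(0, 0.02, (dim, dim)) for _ in range(3))
    scene_tokens = patchify(scene, patch) @ w_embed
    query_tokens = np.stack([v.reshape(-1) for v in canonical_views]) @ w_embed
    tokens = np.concatenate([scene_tokens, query_tokens], axis=0)
    out = self_attention(tokens, wq, wk, wv)
    # Keep only the outputs at the canonical-view (query) positions.
    return out[len(scene_tokens):]

def pairwise_logits(emb, w_rel):
    """Hypothetical readout: score relations for every ordered object pair."""
    n = emb.shape[0]
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    feats = np.stack([np.concatenate([emb[i], emb[j]]) for i, j in pairs])
    return pairs, feats @ w_rel

scene = rng.random((32, 32, 3))                      # stand-in RGB frame
views = [rng.random((8, 8, 3)) for _ in range(3)]    # three objects of interest
emb = object_embeddings(scene, views)                # shape (3, 32)
pairs, logits = pairwise_logits(emb, rng.normal(size=(64, 4)))  # 4 relation types
```

Because the objects of interest enter only as extra tokens, the number of objects can change between calls without changing any weight shapes. That is one plausible reading of how conditioning on canonical views could support unseen objects, though this sketch does not demonstrate generalization.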

Topics

Artificial Intelligence > Core AI > Multimodal Learning
Machine Learning > Core Methods > Representation Learning
Machine Learning > Learning Types > Zero-Shot Learning

Keywords

zero-shot learning; task planning; object-centric representation; spatial reasoning; canonical view; sequential manipulation
