StructDiffusion: Language-Guided Creation of Physically-Valid Structures using Unseen Objects

Weiyu Liu; Yilun Du; Tucker Hermans; Sonia Chernova; Chris Paxton

RSS 2023

Abstract

Robots operating in human environments must be able to rearrange objects into semantically-meaningful configurations, even if these objects are previously unseen. In this work, we focus on the problem of building physically-valid structures without step-by-step instructions. We propose StructDiffusion, which combines a diffusion model and an object-centric transformer to construct structures given partial-view point clouds and high-level language goals, such as "set the table". Our method can perform multiple challenging language-conditioned, multi-step 3D planning tasks using a single model. StructDiffusion improves the success rate of assembling physically-valid structures out of unseen objects by 16% on average over an existing multi-modal transformer model trained on specific structures. We show experiments on held-out objects both in simulation and on real-world rearrangement tasks. Importantly, we show how integrating both a diffusion model and a collision-discriminator model allows for improved generalization over other methods when rearranging previously-unseen objects.

Authors

Weiyu Liu, Yilun Du, Tucker Hermans, Sonia Chernova, Chris Paxton

Topics

Artificial Intelligence > Core AI > Multimodal Learning
Artificial Intelligence > Core AI > Planning
Artificial Intelligence > Learning Paradigms > Zero-Shot Learning

Keywords

zero-shot learning, multi-step planning, diffusion model, language guidance, structure assembly
