Learning Visually Guided Latent Actions for Assistive Teleoperation

Siddharth Karamcheti; Albert J. Zhai; Dylan P. Losey; Dorsa Sadigh

L4DC 2021

Abstract

It is challenging for humans, particularly people living with physical disabilities, to control high-dimensional and dexterous robots. Prior work explores how robots can learn embedding functions that map a human’s low-dimensional inputs (e.g., via a joystick) to complex, high-dimensional robot actions for assistive teleoperation; unfortunately, there are many more high-dimensional actions than available low-dimensional inputs! To extract the correct action and maximally assist their human controller, robots must reason over their current context: for example, pressing a joystick right when interacting with a coffee cup indicates a different action than when interacting with food. In this work, we develop assistive robots that condition their latent embeddings on visual inputs. We explore a spectrum of plausible visual encoders and show that incorporating object detectors pretrained on a small amount of cheap and easy-to-collect structured data enables i) accurately and robustly recognizing the current context and ii) generalizing control embeddings to new objects and tasks. In user studies with a high-dimensional physical robot arm, participants leverage this approach to perform new tasks with unseen objects. Our results indicate that structured visual representations improve few-shot performance and are subjectively preferred by users.
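To make the idea of a visually conditioned latent action concrete, here is a minimal sketch of a decoder that maps a low-dimensional joystick input, the robot state, and a visual context vector to a high-dimensional action. Everything in it is an assumption for illustration: the dimensions, the names `make_decoder` and `decode`, the two-layer network, and the random untrained weights are hypothetical and are not taken from the paper. The context vector stands in for whatever a visual encoder or object detector would produce.

```python
import math
import random

# All dimensions below are illustrative assumptions, not values from the paper.
ACTION_DIM = 7    # e.g. joint velocities of a 7-DoF arm
LATENT_DIM = 1    # e.g. a single joystick axis
STATE_DIM = 7     # e.g. current joint positions
CONTEXT_DIM = 3   # e.g. a detected object's position from a visual encoder


def make_decoder(seed=0, hidden=16):
    """Build a tiny, randomly initialised (untrained) decoder.

    The returned function maps (latent input, state, visual context)
    to a high-dimensional action through one tanh hidden layer.
    """
    rng = random.Random(seed)
    in_dim = LATENT_DIM + STATE_DIM + CONTEXT_DIM
    w1 = [[rng.gauss(0.0, 0.5) for _ in range(in_dim)] for _ in range(hidden)]
    w2 = [[rng.gauss(0.0, 0.5) for _ in range(hidden)] for _ in range(ACTION_DIM)]

    def decode(z, state, context):
        # Concatenate the human input with the robot state and visual context.
        x = list(z) + list(state) + list(context)
        if len(x) != in_dim:
            raise ValueError("unexpected input dimensions")
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w1]
        return [sum(w * hi for w, hi in zip(row, h)) for row in w2]

    return decode


decode = make_decoder()
state = [0.0] * STATE_DIM
joystick_right = [1.0]

# Hypothetical context vectors for two different detected objects.
cup_context = [0.4, 0.1, 0.2]
food_context = [0.2, -0.3, 0.05]

action_near_cup = decode(joystick_right, state, cup_context)
action_near_food = decode(joystick_right, state, food_context)
```

Because the context is part of the decoder input, the same joystick press produces a different 7-dimensional action near the cup than near the food, which is the behaviour the abstract describes. In a real system the weights would be learned from demonstrations rather than sampled at random.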

Topics

Artificial Intelligence > Core AI > Multimodal Learning
Artificial Intelligence > Learning Paradigms > Few-Shot Learning
Computer Vision > Analysis > Object Detection
Reinforcement Learning > Applications > Robotics
Robotics > Capabilities > Manipulation
Artificial Intelligence > Core AI > Robotics

Keywords

few-shot learning, object detection, visual recognition, latent embedding, assistive robotics, assistive robot, latent action space, visual guided control, robot teleoperation
