SECANT: Self-Expert Cloning for Zero-Shot Generalization of Visual Policies

Linxi Fan; Guanzhi Wang; De-An Huang; Zhiding Yu; Li Fei-Fei; Yuke Zhu; Animashree Anandkumar

ICML 2021

Abstract

Generalization has been a long-standing challenge for reinforcement learning (RL). Visual RL, in particular, can be easily distracted by irrelevant factors in high-dimensional observation space. In this work, we consider robust policy learning which targets zero-shot generalization to unseen visual environments with large distributional shift. We propose SECANT, a novel self-expert cloning technique that leverages image augmentation in two stages to *decouple* robust representation learning from policy optimization. Specifically, an expert policy is first trained by RL from scratch with weak augmentations. A student network then learns to mimic the expert policy by supervised learning with strong augmentations, making its representation more robust against visual variations compared to the expert. Extensive experiments demonstrate that SECANT significantly advances the state of the art in zero-shot generalization across 4 challenging domains. Our average reward improvements over prior SOTAs are: DeepMind Control (+26.5%), robotic manipulation (+337.8%), vision-based autonomous driving (+47.7%), and indoor object navigation (+15.8%). Code release and video are available at https://linxifan.github.io/secant-site/.
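The cloning stage described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the expert here is a fixed random linear map standing in for a policy trained by RL, the student is also linear, and `strong_augment` (random brightness scaling plus Gaussian noise) is a hypothetical stand-in for the paper's strong image augmentations. The only point it shows is the supervision pattern: the expert labels the original observation, while the student must predict that label from a strongly augmented view.

```python
import numpy as np


def strong_augment(obs, rng):
    """Hypothetical strong augmentation: per-sample brightness scaling plus noise."""
    scale = rng.uniform(0.5, 1.5, size=(obs.shape[0], 1))
    noise = rng.normal(0.0, 0.1, size=obs.shape)
    return obs * scale + noise


def clone_expert(expert_w, steps=500, batch=64, lr=0.05, seed=0):
    """Fit a linear student to imitate a linear expert under strong augmentation."""
    rng = np.random.default_rng(seed)
    obs_dim, act_dim = expert_w.shape
    student_w = np.zeros((obs_dim, act_dim))
    losses = []
    for _ in range(steps):
        obs = rng.normal(size=(batch, obs_dim))   # toy observations (assumption)
        target = obs @ expert_w                   # expert acts on the original observation
        aug = strong_augment(obs, rng)            # student sees the augmented view
        err = aug @ student_w - target
        losses.append(float(np.mean(err ** 2)))   # imitation loss (mean squared error)
        grad = 2.0 * aug.T @ err / err.size       # gradient of the MSE w.r.t. student_w
        student_w -= lr * grad
    return student_w, losses


# Stand-in for an expert policy; in the paper the expert is trained by RL.
expert_w = np.random.default_rng(1).normal(size=(8, 2))
student_w, losses = clone_expert(expert_w)
print(f"first loss {losses[0]:.3f}, mean of last 50 losses {np.mean(losses[-50:]):.3f}")
```

The imitation loss does not reach zero in this toy setting, because the student cannot undo the random brightness scaling exactly; it settles on the weights that match the expert best on average across augmentations.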

Topics

Machine Learning > Learning Types > Self-Supervised Learning
Machine Learning > Learning Types > Zero-Shot Learning
Machine Learning > Application Areas > Domain Generalization
Computer Vision > Domain-Specific > Autonomous Driving
Reinforcement Learning > Applications > Robotics
Artificial Intelligence > Core AI > Robotics
Machine Learning > Learning Paradigms > Zero-Shot Learning

Keywords

reinforcement learning, robotic manipulation, domain generalization, supervised learning, zero-shot generalization, self-expert cloning, visual policy, image augmentation, policy cloning
