Self-Predictive Dynamics for Generalization of Vision-based Reinforcement Learning

Kyungsoo Kim; Jeongsoo Ha; Yusung Kim

2022 IJCAI IJCAI 2022

Self-Predictive Dynamics for Generalization of Vision-based Reinforcement Learning

Abstract

Vision-based reinforcement learning requires efficient and robust representations of image-based observations, especially when the images contain distracting (task-irrelevant) elements such as shadows, clouds, and light. It becomes more important if those distractions are not exposed during training. We design a Self-Predictive Dynamics (SPD) method to extract task-relevant features efficiently, even in unseen observations after training. SPD uses weak and strong augmentations in parallel, and learns representations by predicting inverse and forward transitions across the two-way augmented versions. In a set of MuJoCo visual control tasks and an autonomous driving task (CARLA), SPD outperforms previous studies in complex observations, and significantly improves the generalization performance for unseen observations. Our code is available at https://github.com/unigary/SPD.

🌉 Interdisciplinary Bridge — Machine Learning and Reinforcement Learning

🧭 Keyword Pioneer — self-predictive dynamics

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Kyungsoo Kim , Jeongsoo Ha , Yusung Kim

Topics

Machine Learning > Learning Types > Self-Supervised Learning Reinforcement Learning > Applications > Robotics

Keywords

representation learning visual control vision-based reinforcement learning self-predictive dynamics

Download PDF

Related papers

Better Collective Decisions via Uncertainty Reduction 2022

Mixed Strategies for Security Games with General Defending Requirements 2022

Achieving Envy-Freeness with Limited Subsidies under Dichotomous Valuations 2022

Distortion in Voting with Top-t Preferences 2022

Let’s Agree to Agree: Targeting Consensus for Incomplete Preferences through Majority Dynamics 2022