Multi-policy Grounding and Ensemble Policy Learning for Transfer Learning with Dynamics Mismatch

Hyun-Rok Lee; Ram Ananth Sreenivasan; Yeonjeong Jeong; Jongseong Jang; Dongsub Shim; Chi-Guhn Lee

IJCAI 2022

Abstract

We propose a new transfer learning algorithm for tasks with different dynamics. The proposed algorithm solves an Imitation from Observation (IfO) problem to ground the source environment to the target task, and then learns an optimal policy in the grounded environment. The learned policy is deployed in the target task without additional training. A distinguishing feature of our algorithm is its use of multiple rollout policies during training, with the goal of grounding the environment more globally; hence the name Multi-Policy Grounding (MPG). The quality of the final policy is further enhanced via ensemble policy learning. We demonstrate the superiority of the proposed algorithm analytically and numerically. Numerical studies show that the proposed multi-policy approach achieves grounding comparable to that of the single-policy approach while using only a fraction of the target samples; hence the algorithm maintains the quality of the obtained policy even as the number of interactions with the target environment becomes extremely small.
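
To make the pipeline in the abstract concrete, the following is a minimal toy sketch, not the authors' implementation. It assumes that grounding is done by fitting a scalar action gain to target transitions (the abstract does not say how grounding is parameterized), and it replaces the IfO-based grounding step with plain least squares. The 1-D dynamics, the names `source_step`, `target_step`, `fit_action_gain`, and `ensemble_action`, and the bootstrap-averaged ensemble are all hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def source_step(s, a):
    """Hypothetical source dynamics: the action moves the state fully."""
    return s + a

def target_step(s, a):
    """Hypothetical target dynamics: the same action has half the effect."""
    return s + 0.5 * a

def collect_target_transitions(policy, n_steps):
    """Roll out one policy in the target and record (s, a, s_next) tuples."""
    s, data = 0.0, []
    for _ in range(n_steps):
        a = policy(s)
        s_next = target_step(s, a)
        data.append((s, a, s_next))
        s = s_next
    return data

def fit_action_gain(transitions):
    """Least-squares gain k so that source_step(s, k * a) matches s_next.

    Assumption: stands in for the IfO-based grounding step of the paper.
    """
    a = np.array([t[1] for t in transitions])
    delta = np.array([t[2] - t[0] for t in transitions])
    return float(a @ delta / (a @ a))

# Multiple rollout policies (random actions at different scales), so the
# target samples cover more of the action space than one policy would.
rollout_policies = [
    (lambda s, scale=scale: scale * rng.uniform(-1.0, 1.0))
    for scale in (0.5, 1.0, 2.0)
]

transitions = []
for policy in rollout_policies:
    transitions += collect_target_transitions(policy, n_steps=20)

# Ground the source environment using the pooled target samples.
k = fit_action_gain(transitions)

def grounded_step(s, a):
    """Source simulator with the fitted action gain applied."""
    return source_step(s, k * a)

GOAL = 1.0

def make_member(gain):
    """One-step policy that reaches GOAL under s' = s + gain * a."""
    return lambda s: (GOAL - s) / gain

# Assumption: a simple ensemble whose members are derived from gains fit
# on bootstrap resamples of the pooled data; actions are averaged.
members = []
for _ in range(5):
    idx = rng.integers(0, len(transitions), size=len(transitions))
    members.append(make_member(fit_action_gain([transitions[i] for i in idx])))

def ensemble_action(s):
    """Average the actions proposed by all ensemble members."""
    return float(np.mean([m(s) for m in members]))

# Deploy in the target without further training.
print(k, target_step(0.0, ensemble_action(0.0)))
```

In this noiseless toy setting the fitted gain recovers the mismatch between the two environments, so the policy obtained in the grounded simulator transfers directly to the target. The paper's setting involves learned policies and richer dynamics, which this sketch does not attempt to reproduce.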

Authors

Hyun-Rok Lee, Ram Ananth Sreenivasan, Yeonjeong Jeong, Jongseong Jang, Dongsub Shim, Chi-Guhn Lee

Topics

Artificial Intelligence > Learning Paradigms > Transfer Learning
Reinforcement Learning > Methods > Policy Learning

Keywords

transfer learning, dynamics mismatch, imitation from observation, multi-policy grounding, ensemble policy learning
