Co-training for Policy Learning

Jialin Song; Ravi Lanka; Yisong Yue; Masahiro Ono

UAI 2019

Abstract

We study the problem of learning sequential decision-making policies in settings with multiple state-action representations. Such settings naturally arise in many domains, such as planning (e.g., multiple integer programming formulations) and various combinatorial optimization problems (e.g., those with both integer programming and graph-based formulations). Inspired by the classical co-training framework for classification, we study the problem of co-training for policy learning. We present sufficient conditions under which learning from two views can improve upon learning from a single view alone. Motivated by these theoretical insights, we present a meta-algorithm for co-training for sequential decision making. Our framework is compatible with both reinforcement learning and imitation learning. We validate the effectiveness of our approach across a wide range of tasks, including discrete/continuous control and combinatorial optimization.
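For readers unfamiliar with the classical co-training framework for classification that the abstract cites as inspiration, the following is a minimal sketch of that classical idea only. It is not the paper's meta-algorithm for policies. All names (`fit_centroids`, `predict`, `co_train`) and the toy data are hypothetical, and the sketch assumes binary labels, one scalar feature per view, and a seed set that contains both classes. Each view trains its own nearest-centroid classifier and, in turn, labels the unlabeled example it is most confident about, so that the other view can learn from it.

```python
from statistics import mean

def fit_centroids(xs, ys):
    """Nearest-centroid classifier on one scalar feature: one mean per class."""
    return {c: mean([x for x, y in zip(xs, ys) if y == c]) for c in set(ys)}

def predict(centroids, x):
    """Return (label, confidence); confidence is the gap between the two centroid distances."""
    ranked = sorted((abs(x - m), c) for c, m in centroids.items())
    return ranked[0][1], ranked[1][0] - ranked[0][0]

def co_train(labeled, unlabeled, rounds=10):
    """labeled: list of ((view_a, view_b), y); unlabeled: list of (view_a, view_b)."""
    labeled, unlabeled = list(labeled), list(unlabeled)
    for _ in range(rounds):
        for view in (0, 1):  # the two views take turns
            if not unlabeled:
                break
            ys = [y for _, y in labeled]
            model = fit_centroids([x[view] for x, _ in labeled], ys)
            # This view pseudo-labels the example it is most confident about
            # and adds it to the shared pool used by both views.
            best = max(unlabeled, key=lambda x: predict(model, x[view])[1])
            label, _ = predict(model, best[view])
            labeled.append((best, label))
            unlabeled.remove(best)
    return labeled

# Toy data (assumed): each example is clear-cut in one view and ambiguous in the other.
seed = [((0.0, 0.0), 0), ((10.0, 10.0), 1)]
pool = [(1.0, 4.9), (9.0, 5.1), (4.9, 1.0), (5.1, 9.0)]
result = dict(co_train(seed, pool))
```

In this toy setup, each unlabeled point is labeled by whichever view sees it clearly, which is the intuition behind learning from two views rather than one.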

Topics

Artificial Intelligence > Core AI > Planning
Reinforcement Learning > Methods > Deep RL
Reinforcement Learning > Methods > Policy Learning
Machine Learning > Learning Types > Multi-Task Learning
Machine Learning > Learning Types > Transfer Learning
Machine Learning > Learning Paradigms > Multi-Task Learning

Keywords

combinatorial optimization, reinforcement learning, sequential decision-making, imitation learning, policy learning
