Emergent Tangled Program Graphs in Multi-Task Learning

Stephen Kelly; Malcolm Heywood

2018 IJCAI IJCAI 2018

Emergent Tangled Program Graphs in Multi-Task Learning

Abstract

We propose a Genetic Programming (GP) framework to address high-dimensional Multi-Task Reinforcement Learning (MTRL) through emergent modularity. A bottom-up process is assumed in which multiple programs self-organize into collective decision-making entities, or teams, which then further develop into multi-team policy graphs, or Tangled Program Graphs (TPG). The framework learns to play three Atari video games simultaneously, producing a single control policy that matches or exceeds leading results from (game-specific) deep reinforcement learning in each game. More importantly, unlike the representation assumed for deep learning, TPG policies start simple and adaptively complexify through interaction with the task environment, resulting in agents that are exceedingly simple, operating in real-time without specialized hardware support such as GPUs.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Machine Learning and Reinforcement Learning

🧭 Keyword Pioneer — genetic programming

🐣 Hot Topic Early Bird — multi-task learning

🐝 Cross-Pollinator — Artificial Intelligence, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Speech & Audio

Authors

Stephen Kelly , Malcolm Heywood

Topics

Artificial Intelligence > Core AI > Multi-Agent Systems Machine Learning > Learning Types > Continual Learning Reinforcement Learning > Applications > Game AI

Keywords

multi-task learning atari game genetic programming tangled program graph modular structure

Download PDF

Related papers

Semi-Supervised Multi-Modal Learning with Incomplete Modalities 2018

High-dimensional Similarity Learning via Dual-sparse Random Projection 2018

FISH-MML: Fisher-HSIC Multi-View Metric Learning 2018

Generative Warfare Nets: Ensemble via Adversaries and Collaborators 2018

Semi-Supervised Optimal Margin Distribution Machines 2018