Scaling All-Goals Updates in Reinforcement Learning Using Convolutional Neural Networks

Fabio Pardo; Vitaly Levdik; Petar Kormushev

AAAI 2020

Abstract

Being able to reach any desired location in the environment can be a valuable asset for an agent. Learning a policy to navigate between all pairs of states individually is often not feasible. An all-goals updating algorithm uses each transition to learn Q-values towards all goals simultaneously and off-policy. However, the large number of expensive parallel updates has so far limited the approach to small tabular cases. To tackle this problem, we propose using convolutional network architectures to generate Q-values and updates for a large number of goals at once. We demonstrate the accuracy and generalization qualities of the proposed method on randomly generated mazes and Sokoban puzzles. In the case of on-screen goal coordinates, the resulting mapping from frames to distance-maps directly informs the agent about which places are reachable and in how many steps. As an example application, we show that replacing the random actions in ε-greedy exploration with several actions towards feasible goals generates better exploratory trajectories in the Montezuma's Revenge and Super Mario All-Stars games.
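To make the all-goals idea concrete, the following is a minimal tabular sketch, not the convolutional method proposed in the paper. It assumes that every state is also a possible goal, that reaching a goal yields a reward of 1 and ends the episode for that goal, and that all other transitions yield 0. The function name `all_goals_update`, the table layout, and the default `alpha` and `gamma` values are illustrative assumptions.

```python
import numpy as np


def all_goals_update(q, s, a, s_next, alpha=0.5, gamma=0.9):
    """Apply one tabular all-goals Q-learning update in place.

    q has shape (n_states, n_goals, n_actions), with n_goals == n_states
    (assumption: every state can serve as a goal).
    """
    goals = np.arange(q.shape[1])
    # Goal g is reached by this transition exactly when s_next == g.
    reached = goals == s_next
    # Assumed reward scheme: 1 on reaching the goal, 0 otherwise.
    reward = reached.astype(q.dtype)
    # max over next actions of Q(s_next, g, a') for every goal g at once.
    bootstrap = q[s_next].max(axis=1)
    # No bootstrapping for the goal that was just reached (terminal for it).
    target = reward + gamma * (~reached) * bootstrap
    # A single transition updates the Q-values of action a for all goals.
    q[s, :, a] += alpha * (target - q[s, :, a])
    return q


# Usage on a hypothetical 3-state chain with 2 actions (0 = left, 1 = right).
q = np.zeros((3, 3, 2))
all_goals_update(q, s=0, a=1, s_next=1, alpha=1.0)
all_goals_update(q, s=1, a=1, s_next=2, alpha=1.0)
all_goals_update(q, s=0, a=1, s_next=1, alpha=1.0)
```

Under this assumed reward scheme and in a deterministic environment, a converged Q-value of `gamma ** (d - 1)` corresponds to a goal that is `d` steps away, which is one way a table of Q-values can be read as a distance map. The cost of the update grows with the number of goals, which is the scaling problem the abstract describes.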

Topics

Machine Learning > Core Methods > Embedding Learning; Deep Learning > Architectures > Neural Networks; Reinforcement Learning; Reinforcement Learning > Methods > Deep RL; Reinforcement Learning > Applications > Game AI; Machine Learning > Learning Types > Reinforcement Learning; Deep Learning > Learning Types > Reinforcement Learning; Deep Learning > Architectures > Convolutional Neural Networks

Keywords

reinforcement learning; value iteration; off-policy learning; convolutional neural network; goal-conditioned policy; exploration strategy; goal-conditioned reinforcement learning; all-goals update; all-goals updating
