#Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning

Haoran Tang; Rein Houthooft; Davis Foote; Adam Stooke; Xi Chen; Yan Duan; John Schulman; Pieter Abbeel; Filip De Turck

NeurIPS (NIPS) 2017

Abstract

Count-based exploration algorithms are known to perform near-optimally when used in conjunction with tabular reinforcement learning (RL) methods for solving small discrete Markov decision processes (MDPs). It is generally thought that count-based methods cannot be applied in high-dimensional state spaces, since most states will occur only once. Recent deep RL exploration strategies are able to deal with high-dimensional continuous state spaces through complex heuristics, often relying on optimism in the face of uncertainty or intrinsic motivation. In this work, we describe a surprising finding: a simple generalization of the classic count-based approach can reach near state-of-the-art performance on various high-dimensional and/or continuous deep RL benchmarks. States are mapped to hash codes, which allows their occurrences to be counted with a hash table. These counts are then used to compute a reward bonus according to the classic count-based exploration theory. We find that simple hash functions can achieve surprisingly good results on many challenging tasks. Furthermore, we show that a domain-dependent learned hash code may further improve these results. Detailed analysis reveals two important properties of a good hash function: 1) having appropriate granularity and 2) encoding information relevant to solving the MDP. This exploration strategy achieves near state-of-the-art performance on both continuous control tasks and Atari 2600 games, hence providing a simple yet powerful baseline for solving MDPs that require considerable exploration.
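The hash-then-count idea described in the abstract can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: it assumes a SimHash-style hash (the sign pattern of a random Gaussian projection) and a bonus of the form β/√n, where n is the visit count of the state's hash code. The class name `HashCountBonus` and the parameters `k`, `beta`, and `seed` are hypothetical names chosen for this example.

```python
from collections import defaultdict

import numpy as np


class HashCountBonus:
    """Illustrative count-based exploration bonus over hashed states.

    Assumption: states are flat real-valued vectors, the hash is the sign
    pattern of a random projection, and the bonus is beta / sqrt(count).
    """

    def __init__(self, state_dim, k=16, beta=0.01, seed=0):
        rng = np.random.default_rng(seed)
        # Fixed random projection matrix; k sets the granularity of the hash
        # (more bits means fewer distinct states share a code).
        self.A = rng.standard_normal((k, state_dim))
        self.beta = beta
        # Hash table mapping each code to its visit count.
        self.counts = defaultdict(int)

    def _code(self, state):
        # k-bit binary code from the signs of the projected state.
        bits = (self.A @ np.asarray(state, dtype=float)) >= 0
        return bits.tobytes()

    def bonus(self, state):
        # Increment the count for this state's code, then return the bonus.
        key = self._code(state)
        self.counts[key] += 1
        return self.beta / np.sqrt(self.counts[key])


if __name__ == "__main__":
    explorer = HashCountBonus(state_dim=4, k=8, beta=1.0)
    s = np.array([0.1, -0.2, 0.3, 0.4])
    # The bonus shrinks as the same hash code is visited repeatedly.
    print(explorer.bonus(s), explorer.bonus(s), explorer.bonus(s))
```

In an RL loop, the returned bonus would typically be added to the environment reward before the policy update. The choice of `k` reflects the granularity trade-off noted in the abstract: too few bits merge unrelated states, while too many leave most codes with a count of one.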

Authors

Haoran Tang, Rein Houthooft, Davis Foote, Adam Stooke, Xi Chen, Yan Duan, John Schulman, Filip De Turck, Pieter Abbeel

Topics

Reinforcement Learning > Methods > Deep RL

Deep Learning > Learning Types > Reinforcement Learning

Keywords

deep reinforcement learning, state space exploration, intrinsic motivation, tabular reinforcement learning, exploration bonus, hash table, hash function, count-based exploration, optimism in the face of uncertainty
