Deep Exploration via Randomized Value Functions

Ian Osband; Benjamin Van Roy; Daniel J. Russo; Zheng Wen

JMLR 2019

Abstract

We study the use of randomized value functions to guide deep exploration in reinforcement learning. This offers an elegant means for synthesizing statistically and computationally efficient exploration with common practical approaches to value function learning. We present several reinforcement learning algorithms that leverage randomized value functions and demonstrate their efficacy through computational studies. We also prove a regret bound that establishes statistical efficiency with a tabular representation.
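To make the idea of a randomized value function concrete, here is a minimal sketch for a tabular, finite-horizon problem. It is an illustration under stated assumptions, not the authors' algorithm: the function names (`sample_q_values`, `greedy_action`), the Gaussian perturbation, the unit prior count, and the `noise_scale` parameter are all hypothetical choices made for this example. The agent draws one perturbed Q-function by noisy backward induction over its empirical data and then acts greedily with respect to that draw; because the noise shrinks as a state-action pair is visited more often, rarely tried pairs are more likely to look attractive in a given draw.

```python
import random


def sample_q_values(counts, reward_sums, next_counts, horizon,
                    noise_scale=1.0, rng=None):
    """Draw one randomized Q-function by noisy backward induction.

    counts[s][a]           number of visits to (s, a)
    reward_sums[s][a]      total reward observed at (s, a)
    next_counts[s][a][s2]  number of observed transitions (s, a) -> s2

    All modelling choices here (Gaussian noise, unit prior count) are
    assumptions made for illustration.
    """
    rng = rng or random.Random()
    n_states = len(counts)
    n_actions = len(counts[0])
    # q[h][s][a]; the extra layer at h == horizon stays zero (terminal values).
    q = [[[0.0] * n_actions for _ in range(n_states)]
         for _ in range(horizon + 1)]
    for h in range(horizon - 1, -1, -1):
        next_v = [max(q[h + 1][s2]) for s2 in range(n_states)]
        for s in range(n_states):
            for a in range(n_actions):
                n = counts[s][a]
                # Empirical estimates, shrunk toward zero by one prior count.
                mean_r = reward_sums[s][a] / (n + 1)
                exp_v = sum(next_counts[s][a][s2] * next_v[s2]
                            for s2 in range(n_states)) / (n + 1)
                # Perturbation whose standard deviation shrinks with visits.
                noise = rng.gauss(0.0, noise_scale / (n + 1) ** 0.5)
                q[h][s][a] = mean_r + exp_v + noise
    return q


def greedy_action(q, h, s):
    """Return the action that maximizes the sampled Q-values at step h, state s."""
    return max(range(len(q[h][s])), key=lambda a: q[h][s][a])
```

In use, an agent would call `sample_q_values` once at the start of each episode and follow `greedy_action` throughout it, so that exploration is driven by a single consistent draw rather than by independent per-step dithering.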

Topics

Machine Learning > Optimization & Theory > Stochastic Processes
Reinforcement Learning > Methods > Deep RL

Keywords

deep reinforcement learning; regret bound; value function learning; randomized value function

Related papers

Adaptation Based on Generalized Discrepancy 2019

Iterated Learning in Dynamic Social Networks 2019

Pyro: Deep Universal Probabilistic Programming 2019

Matched Bipartite Block Model with Covariates 2019

Approximation Hardness for A Class of Sparse Optimization Problems 2019