Contrasting Exploration in Parameter and Action Space: A Zeroth-Order Optimization Perspective

Anirudh Vemula; Wen Sun; J. Bagnell

AISTATS 2019

Abstract

Black-box optimizers that explore in parameter space have often been shown to outperform more sophisticated action space exploration methods developed specifically for the reinforcement learning problem. We examine these black-box methods closely to identify the situations in which they are worse than action space exploration methods and those in which they are superior. Through simple theoretical analyses, we prove that the complexity of exploration in parameter space depends on the dimensionality of the parameter space, while the complexity of exploration in action space depends on both the dimensionality of the action space and the horizon length. This is also demonstrated empirically by comparing simple exploration methods on several model problems, including Contextual Bandit, Linear Regression and Reinforcement Learning in continuous control.
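
The contrast between the two exploration styles can be illustrated with a small sketch. It is not the authors' algorithm: it assumes a hypothetical one-step problem (horizon 1) with a linear policy a = W x and a quadratic cost, and it uses generic two-point zeroth-order gradient estimates. The names `param_space_grad`, `action_space_grad`, `cost`, and all dimensions and constants are assumptions made for illustration. One estimator perturbs the parameters W directly; the other perturbs the action and maps the result back to W through the policy Jacobian.

```python
import numpy as np


def unit_sphere(rng, shape):
    """Draw a direction uniformly from the unit sphere (Frobenius norm 1)."""
    v = rng.standard_normal(shape)
    return v / np.linalg.norm(v)


def cost(action, target):
    """Hypothetical one-step cost: squared distance to a target action."""
    return float(np.sum((action - target) ** 2))


def param_space_grad(W, x, target, delta, rng):
    """Two-point estimate that perturbs the policy parameters W directly."""
    U = unit_sphere(rng, W.shape)
    diff = cost((W + delta * U) @ x, target) - cost((W - delta * U) @ x, target)
    return W.size / (2.0 * delta) * diff * U


def action_space_grad(W, x, target, delta, rng):
    """Two-point estimate that perturbs the action, then applies the chain rule."""
    a = W @ x
    u = unit_sphere(rng, a.shape)
    diff = cost(a + delta * u, target) - cost(a - delta * u, target)
    grad_a = a.size / (2.0 * delta) * diff * u
    return np.outer(grad_a, x)  # Jacobian of a = W x with respect to W


def mean_sq_error(estimates, reference):
    """Average squared Frobenius distance between estimates and a reference."""
    return float(np.mean(np.sum((estimates - reference) ** 2, axis=(1, 2))))


rng = np.random.default_rng(0)
p, d = 2, 10                       # action dimension, context dimension (assumed)
W = rng.standard_normal((p, d))    # policy parameters: p * d = 20 dimensions
x = rng.standard_normal(d)         # a single context
target = rng.standard_normal(p)    # action that minimizes the cost
true_grad = 2.0 * np.outer(W @ x - target, x)

n = 4000
param_est = np.stack([param_space_grad(W, x, target, 0.1, rng) for _ in range(n)])
action_est = np.stack([action_space_grad(W, x, target, 0.1, rng) for _ in range(n)])

param_mse = mean_sq_error(param_est, true_grad)
action_mse = mean_sq_error(action_est, true_grad)
print("parameter-space estimator MSE:", param_mse)
print("action-space estimator MSE:   ", action_mse)
```

In this toy setting both estimators are unbiased for the true gradient, but the parameter-space estimate searches over p * d = 20 directions while the action-space estimate searches over only p = 2, so its per-sample error is smaller. This matches the one-step side of the trade-off described in the abstract; the sketch says nothing about long horizons, where the abstract states that action space exploration also pays for the horizon length.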

Topics

Artificial Intelligence > Core AI > Agent Systems
Machine Learning > Optimization & Theory > Optimization
Reinforcement Learning > Methods > Deep RL
Machine Learning > Learning Types > Reinforcement Learning
Mathematics & Optimization > Optimization > Optimization

Keywords

reinforcement learning, black-box optimization, continuous control, zeroth-order optimization, action space, parameter space
