The Value Function Polytope in Reinforcement Learning

Robert Dadashi; Adrien Ali Taiga; Nicolas Le Roux; Dale Schuurmans; Marc G. Bellemare

ICML 2019

Abstract

We establish geometric and topological properties of the space of value functions in finite state-action Markov decision processes. Our main contribution is the characterization of the nature of its shape: a general polytope (Aigner et al., 2010). To demonstrate this result, we exhibit several properties of the structural relationship between policies and value functions, including the line theorem, which shows that the value functions of policies constrained on all but one state describe a line segment. Finally, we use this novel perspective to introduce visualizations that enhance the understanding of the dynamics of reinforcement learning algorithms.
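The line theorem stated in the abstract can be checked numerically on a small example. The sketch below is an illustration only, not the authors' code: the random MDP, the discount factor of 0.9, and the helper names `policy_value` and `values_along_state` are all assumptions made for this example. It fixes a policy on every state but one, sweeps the action probabilities at the remaining state, and tests whether the resulting value functions are collinear.

```python
import numpy as np

def policy_value(P, r, pi, gamma):
    """Solve V = r_pi + gamma * P_pi V for a stochastic policy pi.

    P has shape (S, A, S), r has shape (S, A), pi has shape (S, A).
    """
    P_pi = np.einsum("sa,sat->st", pi, P)  # state-to-state transitions under pi
    r_pi = np.einsum("sa,sa->s", pi, r)    # expected one-step reward under pi
    n_states = P.shape[0]
    return np.linalg.solve(np.eye(n_states) - gamma * P_pi, r_pi)

# Hypothetical random MDP (3 states, 2 actions); not taken from the paper.
rng = np.random.default_rng(0)
n_states, n_actions, gamma = 3, 2, 0.9
P = rng.random((n_states, n_actions, n_states))
P /= P.sum(axis=-1, keepdims=True)         # normalize rows into distributions
r = rng.random((n_states, n_actions))

# Base policy: held fixed on every state except the one being varied.
base = rng.random((n_states, n_actions))
base /= base.sum(axis=-1, keepdims=True)

def values_along_state(s, alphas):
    """Value functions of policies equal to `base` except at state s."""
    values = []
    for a in alphas:
        pi = base.copy()
        pi[s] = [a, 1.0 - a]               # mix the two actions at state s
        values.append(policy_value(P, r, pi, gamma))
    return np.array(values)

values = values_along_state(0, np.linspace(0.0, 1.0, 11))

# Collinearity check: differences from the first point should span one direction,
# so every singular value after the first should be numerically zero.
diffs = values[1:] - values[0]
singular_values = np.linalg.svd(diffs, compute_uv=False)
print(singular_values)
```

If the value functions lie on a line, only the first printed singular value is meaningfully nonzero. The mixing weight at the varied state does not map linearly onto the position along the segment, so the points are generally not evenly spaced even though they are collinear.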

Topics

Machine Learning > Optimization & Theory > Theory
Reinforcement Learning > Methods > Policy Learning
Reinforcement Learning > Applications > Value Iteration
Mathematics & Optimization > Mathematics > Geometry

Keywords

policy optimization; geometric analysis; Markov decision process; value function; policy analysis; value function polytope; topological property; linear interpolation
