Value Preserving State-Action Abstractions

David Abel; Nate Umbanhowar; Khimya Khetarpal; Dilip Arumugam; Doina Precup; Michael Littman

AISTATS 2020

Abstract

Abstraction can improve the sample efficiency of reinforcement learning. However, the process of abstraction inherently discards information, potentially compromising an agent’s ability to represent high-value policies. To mitigate this, we here introduce combinations of state abstractions and options that are guaranteed to preserve representation of near-optimal policies. We first define $\phi$-relative options, a general formalism for analyzing the value loss of options paired with a state abstraction, and present necessary and sufficient conditions for $\phi$-relative options to preserve near-optimal behavior in any finite Markov Decision Process. We further show that, under appropriate assumptions, $\phi$-relative options can be composed to induce hierarchical abstractions that are also guaranteed to represent high-value policies.
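The abstract does not spell out what makes an option $\phi$-relative. The sketch below assumes the usual reading of the formalism: an option is tied to one abstract state, may start in any ground state that $\phi$ maps to that abstract state, follows a ground-level policy defined on those states, and terminates as soon as the agent reaches a ground state that $\phi$ maps elsewhere. The four-state chain, the `Option` container, and the helpers `make_phi_relative_option`, `run_option`, and `chain_step` are hypothetical names chosen for illustration and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = int
Action = int


@dataclass(frozen=True)
class Option:
    """Illustrative option container (hypothetical, not the paper's code)."""
    abstract_state: int
    can_start: Callable[[State], bool]   # initiation condition
    terminates: Callable[[State], bool]  # termination condition
    policy: Dict[State, Action]          # ground policy inside the block


def make_phi_relative_option(
    phi: Dict[State, int], abstract_state: int, policy: Dict[State, Action]
) -> Option:
    """Build an option that starts inside one block of phi and stops on leaving it."""
    block = {s for s, a in phi.items() if a == abstract_state}
    if set(policy) != block:
        raise ValueError("policy must cover exactly the states in the block")
    return Option(
        abstract_state=abstract_state,
        can_start=lambda s: phi[s] == abstract_state,
        terminates=lambda s: phi[s] != abstract_state,
        policy=dict(policy),
    )


def run_option(
    option: Option,
    start: State,
    step: Callable[[State, Action], State],
    max_steps: int = 100,
) -> List[State]:
    """Follow the option's policy from `start` until it terminates."""
    if not option.can_start(start):
        raise ValueError("option cannot be initiated in this state")
    s = start
    trajectory = [s]
    for _ in range(max_steps):
        if option.terminates(s):
            break
        s = step(s, option.policy[s])
        trajectory.append(s)
    return trajectory


def chain_step(s: State, a: Action) -> State:
    """Deterministic toy dynamics: a chain of states 0..3, actions -1 or +1."""
    return min(3, max(0, s + a))


# Assumed state abstraction: ground states {0, 1} -> 0 and {2, 3} -> 1.
phi = {0: 0, 1: 0, 2: 1, 3: 1}

# One option for abstract state 0 that always moves right.
go_right = make_phi_relative_option(phi, abstract_state=0, policy={0: +1, 1: +1})
trajectory = run_option(go_right, start=0, step=chain_step)
print(trajectory)
```

In this toy setting the option runs through the ground states of abstract state 0 and stops at the first ground state belonging to abstract state 1. A policy over abstract states would then choose among such options, which is the kind of state-action abstraction whose value loss the paper analyzes.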

Topics

Artificial Intelligence > Core AI > Agent Systems; Reinforcement Learning > Methods > Policy Learning; Reinforcement Learning > Applications > Robotics; Reinforcement Learning > Applications > Value Iteration

Keywords

reinforcement learning; state abstraction; hierarchical reinforcement learning; Markov decision process; value function; hierarchical abstraction
