2019
UAI
UAI 2019
Approximate Relative Value Learning for Average-reward Continuous State MDPs
Abstract
In this paper, we propose an approximate relative value learning (ARVL) algorithm for non- parametric MDPs with continuous state space and finite actions and average reward criterion. It is a sampling based algorithm combined with kernel density estimation and function approximation via nearest neighbors. The theoretical analysis is done via a random contraction operator framework and stochastic dominance argument. This is the first such algorithm for continuous state space MDPs with average re- ward criteria with these provable properties which does not require any discretization of state space as far as we know. We then evaluate the proposed algorithm on a benchmark problem numerically.
🚀
Conference Pioneer
— UAI 2019
🧭
Keyword Pioneer
— average reward mdp
🐝
Cross-Pollinator
— Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics
🌉
Interdisciplinary Bridge
— Machine Learning and Mathematics & Optimization and Reinforcement Learning
🐣
Hot Topic Early Bird
— value iteration