2010 NIPS NeurIPS 2010

Interval Estimation for Reinforcement-Learning Algorithms in Continuous-State Domains

Abstract

The reinforcement learning community has explored many approaches to obtain- ing value estimates and models to guide decision making; these approaches, how- ever, do not usually provide a measure of confidence in the estimate. Accurate estimates of an agent’s confidence are useful for many applications, such as bi- asing exploration and automatically adjusting parameters to reduce dependence on parameter-tuning. Computing confidence intervals on reinforcement learning value estimates, however, is challenging because data generated by the agent- environment interaction rarely satisfies traditional assumptions. Samples of value- estimates are dependent, likely non-normally distributed and often limited, partic- ularly in early learning when confidence estimates are pivotal. In this work, we investigate how to compute robust confidences for value estimates in continuous Markov decision processes. We illustrate how to use bootstrapping to compute confidence intervals online under a changing policy (previously not possible) and prove validity under a few reasonable assumptions. We demonstrate the applica- bility of our confidence estimation algorithms with experiments on exploration, parameter estimation and tracking.

πŸŒ‰ Interdisciplinary Bridge β€” Machine Learning and Reinforcement Learning
🧭 Keyword Pioneer β€” value estimation
🐣 Hot Topic Early Bird β€” reinforcement learning
🐝 Cross-Pollinator β€” Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics
🌱 Topic Pioneer β€” Uncertainty Quantification
πŸ“ˆ Trend Setter β€” Uncertainty Quantification