Online Linear Regression and Its Application to Model-Based Reinforcement Learning

Alexander L. Strehl; Michael L. Littman

NIPS 2007 (NeurIPS 2007)

Abstract

We provide a provably efficient algorithm for learning Markov Decision Processes (MDPs) with continuous state and action spaces in the online setting. Specifically, we take a model-based approach and show that a special type of online linear regression allows us to learn MDPs with (possibly kernelized) linearly parameterized dynamics. This result builds on Kearns and Singh's work that provides a provably efficient algorithm for finite-state MDPs. Our approach is not restricted to the linear setting, and is applicable to other classes of continuous MDPs.
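
To make the phrase "online linear regression" concrete, here is a minimal sketch of a generic recursive least-squares learner that refines a linear model one sample at a time. It is not the special regression variant the abstract refers to, and it carries none of the paper's guarantees. The class name `OnlineRidgeRegression`, the `reg` parameter, and the toy data are assumptions made purely for illustration.

```python
import numpy as np


class OnlineRidgeRegression:
    """Illustrative recursive least-squares learner for y ~ theta . x.

    Hypothetical helper for this sketch; not the algorithm from the paper.
    """

    def __init__(self, dim, reg=1e-3):
        # P tracks the inverse of (reg * I + sum of x x^T) over samples seen.
        self.P = np.eye(dim) / reg
        self.theta = np.zeros(dim)

    def predict(self, x):
        # Linear prediction under the current parameter estimate.
        return float(self.theta @ x)

    def update(self, x, y):
        # Sherman-Morrison rank-one update of the inverse Gram matrix,
        # followed by a correction of theta along the gain direction.
        Px = self.P @ x
        gain = Px / (1.0 + x @ Px)
        self.theta = self.theta + gain * (y - self.theta @ x)
        self.P = self.P - np.outer(gain, Px)


# Toy usage: recover assumed "true" linear parameters from a noiseless stream.
rng = np.random.default_rng(0)
true_theta = np.array([2.0, -1.0])
model = OnlineRidgeRegression(dim=2)
for _ in range(200):
    x = rng.normal(size=2)
    model.update(x, float(true_theta @ x))
```

In a model-based setting, one could imagine fitting one such regressor per next-state coordinate, with `x` built from the current state and action; that wiring is an assumption of this sketch rather than something stated in the abstract.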

Topics

Machine Learning > Core Methods > Regression
Machine Learning > Optimization & Theory > Optimization
Reinforcement Learning > Methods > Deep RL
Machine Learning > Learning Types > Online Learning
Machine Learning > Learning Types > Reinforcement Learning
Machine Learning > Learning Types > Model-Based RL

Keywords

Markov decision processes, online linear regression, model-based reinforcement learning, kernelized dynamics, dynamic system, continuous state space
