2010
NIPS
NeurIPS 2010
LSTD with Random Projections
Abstract
We consider the problem of reinforcement learning in high-dimensional spaces when the number of features is larger than the number of samples. In particular, we study the least-squares temporal difference (LSTD) learning algorithm when a low-dimensional space is generated by a random projection from a high-dimensional space. We provide a thorough theoretical analysis of LSTD with random projections and derive performance bounds for the resulting algorithm. We also show how the error of LSTD with random projections propagates through the iterations of a policy iteration algorithm and provide a performance bound for the resulting least-squares policy iteration (LSPI) algorithm.
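The idea in the abstract can be illustrated with a minimal sketch: project the high-dimensional features to a low dimension with a random matrix, then run ordinary LSTD on the projected features. This is not the authors' code. The function names (`random_projection`, `lstd`, `lstd_rp`), the Gaussian projection with entry variance 1/d, and the use of a least-squares solver instead of an explicit matrix inverse are all assumptions made for illustration.

```python
import numpy as np


def random_projection(D, d, rng):
    """Return a d x D matrix with i.i.d. N(0, 1/d) entries (assumed choice)."""
    return rng.normal(loc=0.0, scale=1.0 / np.sqrt(d), size=(d, D))


def lstd(phi, phi_next, rewards, gamma):
    """Plain LSTD on a batch of transitions.

    phi, phi_next: (n, k) feature matrices for states and successor states.
    rewards: (n,) observed rewards.
    Solves A theta = b with A = phi^T (phi - gamma * phi_next), b = phi^T r.
    """
    A = phi.T @ (phi - gamma * phi_next)
    b = phi.T @ rewards
    # lstsq is used so a singular A yields a minimum-norm solution
    # rather than raising an error (an illustrative choice).
    theta, *_ = np.linalg.lstsq(A, b, rcond=None)
    return theta


def lstd_rp(phi, phi_next, rewards, gamma, d, rng):
    """LSTD in a randomly projected d-dimensional feature space."""
    P = random_projection(phi.shape[1], d, rng)
    psi = phi @ P.T            # (n, d) projected features
    psi_next = phi_next @ P.T  # same projection for successor states
    theta = lstd(psi, psi_next, rewards, gamma)
    return theta, P


if __name__ == "__main__":
    # Synthetic data with more features (D) than samples (n).
    rng = np.random.default_rng(0)
    n, D, d = 20, 50, 5
    phi = rng.normal(size=(n, D))
    phi_next = rng.normal(size=(n, D))
    rewards = rng.normal(size=n)
    theta, P = lstd_rp(phi, phi_next, rewards, gamma=0.9, d=d, rng=rng)
    values = (phi @ P.T) @ theta  # value estimates at the sampled states
    print(values.shape)
```

The same projection matrix must be applied to the features of both the current and the successor states, since the value estimate at any state is the inner product of its projected features with `theta`.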
Interdisciplinary Bridge: Machine Learning and Reinforcement Learning
Keyword Pioneer: least-squares temporal difference
Hot Topic Early Bird: reinforcement learning
Cross-Pollinator: Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy
Trend Setter: Value Iteration
Topics
Machine Learning > Optimization & Theory > Optimization
Machine Learning > Optimization & Theory > Statistical Learning
Machine Learning > Optimization & Theory > Theory
Reinforcement Learning > Methods > Deep RL
Machine Learning > Core Methods > Dimensionality Reduction
Machine Learning > Learning Types > Reinforcement Learning
Reinforcement Learning > Methods > Value Iteration
Deep Learning > Optimization & Theory > Stochastic Methods