Linear Complementarity for Regularized Policy Evaluation and Improvement

Jeffrey Johns; Christopher Painter-Wakefield; Ronald Parr

NIPS (NeurIPS) 2010

Abstract

Recent work in reinforcement learning has emphasized the power of L1 regularization to perform feature selection and prevent overfitting. We propose formulating the L1 regularized linear fixed point problem as a linear complementarity problem (LCP). This formulation offers several advantages over the LARS-inspired formulation, LARS-TD. The LCP formulation allows the use of efficient off-the-shelf solvers, leads to a new uniqueness result, and can be initialized with starting points from similar problems (warm starts). We demonstrate that warm starts, as well as the efficiency of LCP solvers, can speed up policy iteration. Moreover, warm starts permit a form of modified policy iteration that can be used to approximate a "greedy" homotopy path, a generalization of the LARS-TD homotopy path that combines policy evaluation and optimization.
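To make the LCP idea in the abstract concrete, here is a minimal sketch of one common way to write an L1 regularized linear fixed point as an LCP. Everything in it is an assumption for illustration and is not taken from the abstract: the names `build_lcp`, `is_lcp_solution`, and `weights_from_lcp` are hypothetical, `A` and `b` are assumed to define the unregularized fixed point A w = b, `beta` is the L1 penalty, and the weights are split as w = x_plus - x_minus. The paper's own construction may differ in notation or sign conventions.

```python
def build_lcp(A, b, beta):
    """Build (M, q) for the LCP: z = M x + q, x >= 0, z >= 0, x . z = 0.

    Assumed setup (illustrative, not from the abstract):
      - A (k x k list of lists) and b (length k) define the fixed point A w = b
      - beta >= 0 is the L1 regularization parameter
      - x = [x_plus, x_minus] with w = x_plus - x_minus
    The slack z = [beta - c, beta + c], where c = b - A w, encodes |c_i| <= beta.
    """
    k = len(b)
    top = [list(A[i]) + [-a for a in A[i]] for i in range(k)]     # [ A, -A]
    bottom = [[-a for a in A[i]] + list(A[i]) for i in range(k)]  # [-A,  A]
    M = top + bottom
    q = [beta - bi for bi in b] + [beta + bi for bi in b]
    return M, q


def is_lcp_solution(M, q, x, tol=1e-9):
    """Check nonnegativity of x and z = M x + q, and complementarity x . z = 0."""
    z = [sum(m * xj for m, xj in zip(row, x)) + qi for row, qi in zip(M, q)]
    nonneg = all(v >= -tol for v in x) and all(v >= -tol for v in z)
    complementary = abs(sum(xi * zi for xi, zi in zip(x, z))) <= tol
    return nonneg and complementary


def weights_from_lcp(x):
    """Recover w = x_plus - x_minus from the stacked LCP variable x."""
    k = len(x) // 2
    return [x[i] - x[i + k] for i in range(k)]


# One-feature example: A = [[2]], b = [3], beta = 1.
# The condition b - A w = beta gives w = 1, i.e. x = [1, 0].
M, q = build_lcp([[2.0]], [3.0], 1.0)
print(is_lcp_solution(M, q, [1.0, 0.0]), weights_from_lcp([1.0, 0.0]))
```

The sketch only builds the problem and checks a candidate; in practice `M` and `q` would be handed to an off-the-shelf LCP solver, and a solution from a similar problem could be supplied as the starting point.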

Authors

Jeffrey Johns, Christopher Painter-Wakefield, Ronald Parr

Topics

Machine Learning > Optimization & Theory > Optimization

Reinforcement Learning > Methods > Deep RL

Reinforcement Learning > Methods > Policy Learning

Machine Learning > Learning Types > Reinforcement Learning

Mathematics & Optimization > Optimization > Optimization

Keywords

reinforcement learning, policy evaluation, feature selection, L1 regularization, policy iteration, policy improvement, linear complementarity, homotopy path, linear complementarity problem

Related papers

Link Discovery using Graph Feature Tracking 2010

Trading off Mistakes and Don't-Know Predictions 2010

A Novel Kernel for Learning a Neuron Model from Spike Train Data 2010

Decomposing Isotonic Regression for Efficiently Solving Large Problems 2010

Learning Kernels with Radiuses of Minimum Enclosing Balls 2010