Efficient Reinforcement Learning for High Dimensional Linear Quadratic Systems

Morteza Ibrahimi; Adel Javanmard; Benjamin Van Roy

NIPS (NeurIPS) 2012

Abstract

We study the problem of adaptive control of a high dimensional linear quadratic (LQ) system. Previous work established the asymptotic convergence to an optimal controller for various adaptive control schemes. More recently, an asymptotic regret bound of $\tilde{O}(\sqrt{T})$ was shown for $T \gg p$, where $p$ is the dimension of the state space. In this work we consider the case where the matrices describing the dynamics of the LQ system are sparse and their dimensions are large. We present an adaptive control scheme that, for $p \gg 1$ and $T \gg \mathrm{polylog}(p)$, achieves a regret bound of $\tilde{O}(p \sqrt{T})$. In particular, our algorithm has an average cost of $(1+\epsilon)$ times the optimum cost after $T = \mathrm{polylog}(p)\, O(1/\epsilon^2)$ time steps. This is in contrast to previous work on dense dynamics, where the algorithm needs $\Omega(p)$ samples before it can estimate the unknown dynamics with any significant accuracy. We believe our result has prominent applications in the emerging area of computational advertising, in particular targeted online advertising and advertising in social networks.
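To make the setting concrete, the following is a minimal sketch of a sparse LQ system and its average cost under a fixed linear policy. It is not the adaptive control scheme of the paper. It assumes the standard LQ form $x_{t+1} = A x_t + B u_t + w_{t+1}$ with stage cost $x_t^\top Q x_t + u_t^\top R u_t$, and takes $Q = I$, $R = I$, and unit-variance Gaussian noise for simplicity. The helper names (`sparse_matvec`, `lq_step`, `stage_cost`, `average_cost`) and the matrices `A`, `B`, `K` are hypothetical and chosen only for illustration.

```python
import random


def sparse_matvec(rows, vec):
    """Multiply a sparse matrix by a dense vector.

    The matrix is a list with one {column: value} dict per row,
    so only the nonzero entries are stored and visited.
    """
    return [sum(val * vec[col] for col, val in row.items()) for row in rows]


def lq_step(A, B, x, u, noise):
    """One transition x_{t+1} = A x_t + B u_t + w_{t+1}."""
    Ax = sparse_matvec(A, x)
    Bu = sparse_matvec(B, u)
    return [a + b + w for a, b, w in zip(Ax, Bu, noise)]


def stage_cost(x, u):
    """Quadratic cost x'Qx + u'Ru with Q = I and R = I (an assumption)."""
    return sum(v * v for v in x) + sum(v * v for v in u)


def average_cost(A, B, K, T, seed=0):
    """Average cost over T steps under the fixed linear policy u_t = -K x_t."""
    rng = random.Random(seed)
    p = len(A)
    x = [0.0] * p
    total = 0.0
    for _ in range(T):
        u = [-v for v in sparse_matvec(K, x)]
        total += stage_cost(x, u)
        noise = [rng.gauss(0.0, 1.0) for _ in range(p)]
        x = lq_step(A, B, x, u, noise)
    return total / T


# Hypothetical sparse example: p = 4 states, at most 2 nonzeros per row of A.
A = [{0: 0.5, 1: 0.2}, {1: 0.4}, {2: 0.3, 3: -0.1}, {3: 0.6}]
B = [{0: 1.0}, {1: 1.0}, {}, {}]  # 4 x 2: only the first two states are actuated
K = [{0: 0.3}, {1: 0.2}]          # 2 x 4 gain, hand-picked and not optimal

print(average_cost(A, B, K, T=1000))
```

In this picture, the regret of a controller over $T$ steps is its accumulated cost minus $T$ times the optimal average cost; the sketch evaluates only the first term for a fixed, non-adaptive gain.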

Topics

Reinforcement Learning > Methods > Deep RL
Reinforcement Learning > Methods > Policy Learning
Reinforcement Learning > Applications > Robotics
Mathematics & Optimization > Optimization > Online Algorithms
Artificial Intelligence > Core AI > Reinforcement Learning

Keywords

reinforcement learning, online learning, stochastic control, linear quadratic regulator, sparse matrix, adaptive control, linear quadratic, high dimensional systems, regret bound, optimal controller
