Online Linear Quadratic Control

Alon Cohen; Avinatan Hasidim; Tomer Koren; Nevena Lazic; Yishay Mansour; Kunal Talwar

ICML 2018

Abstract

We study the problem of controlling linear time-invariant systems with known noisy dynamics and adversarially chosen quadratic losses. We present the first efficient online learning algorithms in this setting that guarantee $O(\sqrt{T})$ regret under mild assumptions, where $T$ is the time horizon. Our algorithms rely on a novel SDP relaxation for the steady-state distribution of the system. Crucially, and in contrast to previously proposed relaxations, the feasible solutions of our SDP all correspond to “strongly stable” policies that mix exponentially fast to a steady state.
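To make the notions of a steady state and a stable policy concrete, here is a minimal sketch for a scalar system, which is a simplification of the matrix setting the abstract describes. It assumes dynamics $x_{t+1} = a x_t + b u_t + w_t$ with zero-mean Gaussian noise of variance `w_var`, a fixed linear policy $u_t = k x_t$, and a single fixed quadratic loss $q x^2 + r u^2$ (the paper allows the losses to change adversarially over time). All function and parameter names are hypothetical, and this is not the paper's SDP-based algorithm; it only illustrates that a stable linear policy induces a stationary distribution whose expected cost can be computed in closed form and checked by simulation.

```python
import random


def steady_state_variance(a, b, k, w_var):
    """Stationary variance of x under x' = (a + b*k) * x + w, Var(w) = w_var."""
    c = a + b * k  # closed-loop coefficient
    if abs(c) >= 1.0:
        raise ValueError("closed loop is not stable: |a + b*k| >= 1")
    # Solves the scalar fixed-point equation s = c^2 * s + w_var.
    return w_var / (1.0 - c * c)


def steady_state_cost(a, b, k, w_var, q, r):
    """Expected per-step cost q*x^2 + r*u^2 at stationarity with u = k*x."""
    return (q + r * k * k) * steady_state_variance(a, b, k, w_var)


def simulate_average_cost(a, b, k, w_var, q, r, steps, seed=0):
    """Empirical average cost of the policy u = k*x, starting from x = 0."""
    rng = random.Random(seed)
    x, total = 0.0, 0.0
    for _ in range(steps):
        u = k * x
        total += q * x * x + r * u * u
        x = a * x + b * u + rng.gauss(0.0, w_var ** 0.5)
    return total / steps
```

With `a = b = 1` and `k = -0.5`, the closed-loop coefficient is `0.5`, so the state forgets its starting point geometrically and the simulated average cost approaches the closed-form value. A policy with `|a + b*k| >= 1` has no steady state, which is why the helper rejects it.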

Topics

Artificial Intelligence > Core AI > Planning
Machine Learning > Optimization & Theory > Optimization
Robotics > Systems > Control Theory
Mathematics & Optimization > Optimization > Online Algorithms
Machine Learning > Learning Types > Online Learning
Mathematics & Optimization > Optimization > Optimal Control

Keywords

online learning, semidefinite programming, regret bound, online algorithm, linear system, linear quadratic control, stable policy
