Learning the Linear Quadratic Regulator from Nonlinear Observations

Zakaria Mhammedi; Dylan J Foster; Max Simchowitz; Dipendra Misra; Wen Sun; Akshay Krishnamurthy; Alexander Rakhlin; John Langford

NeurIPS 2020

Abstract

We introduce a new problem setting for continuous control called the LQR with Rich Observations, or RichLQR. In our setting, the environment is summarized by a low-dimensional continuous latent state with linear dynamics and quadratic costs, but the agent operates on high-dimensional, nonlinear observations such as images from a camera. To enable sample-efficient learning, we assume that the learner has access to a class of decoder functions (e.g., neural networks) that is flexible enough to capture the mapping from observations to latent states. We introduce a new algorithm, RichID, which learns a near-optimal policy for the RichLQR with sample complexity scaling only with the dimension of the latent state space and the capacity of the decoder function class. RichID is oracle-efficient and accesses the decoder class only through calls to a least-squares regression oracle. To our knowledge, our results constitute the first provable sample complexity guarantee for continuous control with an unknown nonlinearity in the system model.
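To make the setting concrete, the following is a minimal simulation sketch of an environment with a low-dimensional latent state, linear dynamics, quadratic costs, and nonlinear observations. It is not RichID and is not taken from the paper: the matrices `A`, `B`, `Q`, `R`, the emission function `observe`, the noise level, and the linear-in-features `least_squares_oracle` are all hypothetical choices made for illustration. The regression at the end uses an arbitrary target (the control applied one step earlier) only to show what a call to a least-squares oracle could look like.

```python
import numpy as np

# Illustrative latent system; every matrix and function here is an assumption.
A = np.array([[0.9, 0.1], [0.0, 0.8]])   # latent dynamics
B = np.array([[0.0], [1.0]])             # control matrix
Q = np.eye(2)                            # state cost
R = np.eye(1)                            # control cost


def observe(x):
    """Hypothetical nonlinear emission from a 2-D latent state to a 4-D observation."""
    return np.concatenate([np.tanh(x), x + x ** 3])


def rollout(horizon, rng, noise_std=0.1):
    """Run the latent system under random Gaussian controls.

    Returns the observations, the controls, and the total quadratic cost.
    The latent state x is used only inside the simulator.
    """
    x = np.zeros(2)
    observations, controls, total_cost = [], [], 0.0
    for _ in range(horizon):
        u = rng.normal(size=1)
        observations.append(observe(x))
        controls.append(u)
        total_cost += float(x @ Q @ x + u @ R @ u)
        # Linear latent dynamics with additive Gaussian noise.
        x = A @ x + B @ u + noise_std * rng.normal(size=2)
    return np.array(observations), np.array(controls), total_cost


def least_squares_oracle(features, targets):
    """Return W minimizing ||features @ W - targets||^2 (a linear-in-features class)."""
    W, *_ = np.linalg.lstsq(features, targets, rcond=None)
    return W


rng = np.random.default_rng(0)
ys, us, cost = rollout(horizon=200, rng=rng)
# Arbitrary illustrative regression: predict the control u_t from the observation y_{t+1}.
W = least_squares_oracle(ys[1:], us[:-1])
print(ys.shape, us.shape, W.shape)
```

In this sketch the learner-facing data are only `ys`, `us`, and the cost; the latent state never leaves `rollout`, which mirrors the premise that the agent operates on observations rather than on the latent state. A richer decoder class (for example, a neural network) would replace the linear fit inside `least_squares_oracle`.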

Topics

Artificial Intelligence > Core AI > Planning
Machine Learning > Optimization & Theory > Statistical Learning
Reinforcement Learning > Methods > Deep RL
Artificial Intelligence > Core AI > Robotics
Artificial Intelligence > Core AI > Reinforcement Learning

Keywords

sample complexity; continuous control; linear quadratic regulator; latent state; sample-efficient learning; nonlinear observation
