Regret Guarantees for Online Deep Control

Xinyi Chen; Edgar Minasyan; Jason D. Lee; Elad Hazan

L4DC 2023

Abstract

Despite the immense success of deep learning in reinforcement learning and control, few theoretical guarantees for neural networks exist for these problems. Deriving performance guarantees is challenging because control is an online problem with no distributional assumptions and an agnostic learning objective, while the theory of deep learning so far focuses on supervised learning with a fixed known training set. In this work, we begin to resolve these challenges and derive the first regret guarantees in online control over a neural network-based policy class. In particular, we show sublinear episodic regret guarantees against a policy class parameterized by deep neural networks, a much richer class than previously considered linear policy parameterizations. Our results center on a reduction from online learning of neural networks to online convex optimization (OCO), and can use any OCO algorithm as a blackbox. Since online learning guarantees are inherently agnostic, we need to quantify the performance of the best policy in our policy class. To this end, we introduce the interpolation dimension, an expressivity metric, which we use to accompany our regret bounds. The results and findings in online deep learning are of independent interest and may have applications beyond online control.
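As a rough illustration of the online convex optimization setting that the abstract's reduction targets, the sketch below runs projected online gradient descent, one possible black-box OCO algorithm, and measures regret against the best fixed decision in hindsight. Everything in it is an assumption made for illustration: the one-dimensional quadratic losses, the 1/t step size (a standard choice for strongly convex losses), and the function names. It is not the algorithm, loss, or policy class from the paper.

```python
import random


def project(x, radius):
    """Euclidean projection onto the interval [-radius, radius]."""
    return max(-radius, min(radius, x))


def online_gradient_descent(targets, radius=1.0):
    """Play x_t, then observe the toy loss f_t(x) = 0.5 * (x - z_t)**2.

    Returns the list of losses incurred by the learner.
    """
    x = 0.0
    losses = []
    for t, z in enumerate(targets, start=1):
        # The loss is revealed only after the decision x has been played.
        losses.append(0.5 * (x - z) ** 2)
        grad = x - z
        # Step size 1/t: an assumed schedule suited to strongly convex losses.
        x = project(x - grad / t, radius)
    return losses


def regret(targets, losses, radius=1.0):
    """Learner's total loss minus that of the best fixed decision in hindsight."""
    # For these quadratic losses the best fixed decision is the projected mean.
    best = project(sum(targets) / len(targets), radius)
    best_loss = sum(0.5 * (best - z) ** 2 for z in targets)
    return sum(losses) - best_loss


if __name__ == "__main__":
    random.seed(0)
    targets = [random.uniform(-0.5, 0.5) for _ in range(2000)]
    losses = online_gradient_descent(targets)
    print("average regret:", regret(targets, losses) / len(targets))
```

Sublinear regret means the average regret printed here shrinks toward zero as the number of rounds grows. The comparator is only the best decision within the chosen class, which is why the abstract pairs its regret bounds with an expressivity measure for that class.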

Topics

Machine Learning > Optimization & Theory > Learning Theory
Machine Learning > Optimization & Theory > Neural Network Optimization
Machine Learning > Optimization & Theory > Online Algorithms

Keywords

reinforcement learning, neural network optimization, online convex optimization, regret bound, deep neural network, online control

Related papers

Model-Based Reinforcement Learning for Cavity Filter Tuning 2023

Learning on Manifolds: Universal Approximations Properties using Geometric Controllability Conditions for Neural ODEs 2023

Hyperparameter Tuning of an Off-Policy Reinforcement Learning Algorithm for H∞ Tracking Control 2023

Policy Learning for Active Target Tracking over Continuous $SE(3)$ Trajectories 2023

Automated Reachability Analysis of Neural Network-Controlled Systems via Adaptive Polytopes 2023