Hyperparameter Tuning of an Off-Policy Reinforcement Learning Algorithm for H∞ Tracking Control

Alireza Farahmandi; Brian C Reitz; Mark Debord; Douglas Philbrick; Katia Estabridis; Gary Hewer

L4DC 2023

Abstract

In this work, we present a parallel-search approach to hyperparameter optimization for an online, off-policy reinforcement learning algorithm. Since this model-free learning algorithm solves the H∞ optimal tracking problem iteratively using ordinary least squares regression, we propose using the condition number of the data matrix as a model-free measure for tuning the hyperparameters. This addition enables automated optimization of the hyperparameters involved. We demonstrate that the condition number is a useful metric for tuning the number of collected samples, the sampling interval, and the other hyperparameters. In addition, we demonstrate a correlation between this condition number and properties of the sum-of-sinusoids persistent excitation.
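
The role of the condition number can be illustrated with a minimal sketch. It is not the authors' algorithm: the data matrix here is a hypothetical toy regressor built from lagged copies of a sum-of-sinusoids probing signal, and the names `sum_of_sinusoids`, `data_matrix`, and `condition_number`, the frequencies, and the grid values are all assumptions made for illustration. The search over the grid is written serially for brevity, although each grid point could be evaluated in parallel.

```python
import itertools
import numpy as np


def sum_of_sinusoids(t, freqs):
    """Probing signal: unit-amplitude sinusoids at the given angular frequencies."""
    t = np.asarray(t, dtype=float)
    return sum(np.sin(w * t) for w in freqs)


def data_matrix(num_samples, dt, freqs, num_lags=3):
    """Toy regression matrix (an assumption, not the paper's data matrix).

    Row k holds the probing signal at t_k and at num_lags - 1 earlier instants.
    """
    t = dt * np.arange(num_samples)
    cols = [sum_of_sinusoids(t - lag * dt, freqs) for lag in range(num_lags)]
    return np.column_stack(cols)


def condition_number(num_samples, dt, freqs):
    """2-norm condition number of the toy data matrix."""
    return np.linalg.cond(data_matrix(num_samples, dt, freqs))


# A single sinusoid cannot excite three regressors: the lagged columns are
# linearly dependent, so the least squares problem is ill-conditioned.
cond_single = condition_number(200, 0.1, freqs=[1.0])

# Adding frequencies makes the columns independent and lowers the condition number.
cond_multi = condition_number(200, 0.1, freqs=[1.0, 2.3, 4.1])

# Grid search over (number of samples, sampling interval), scored only by the
# condition number, so no model of the plant is needed to rank the candidates.
grid = list(itertools.product([50, 200, 800], [0.01, 0.1, 0.5]))
best = min(grid, key=lambda p: condition_number(p[0], p[1], [1.0, 2.3, 4.1]))
```

In this toy setting, `best` is simply the grid point whose data matrix is best conditioned; the same scoring idea extends to any hyperparameter that changes the collected data.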

Topics

Reinforcement Learning > Methods > Deep RL
Reinforcement Learning > Applications > Robotics

Keywords

least squares regression, off-policy reinforcement learning, hyperparameter tuning, condition number, H-infinity control, tracking control

Related papers

Model-Based Reinforcement Learning for Cavity Filter Tuning 2023

Learning on Manifolds: Universal Approximations Properties using Geometric Controllability Conditions for Neural ODEs 2023

Policy Learning for Active Target Tracking over Continuous $SE(3)$ Trajectories 2023

Automated Reachability Analysis of Neural Network-Controlled Systems via Adaptive Polytopes 2023

Template-Based Piecewise Affine Regression 2023