Regularized Parameter Uncertainty for Improving Generalization in Reinforcement Learning

Pehuen Moure; Longbiao Cheng; Joachim Ott; Zuowen Wang; Shih-Chii Liu

CVPR 2024

Abstract

For reinforcement learning (RL) agents to be deployed in real-world environments, they must be able to generalize to unseen environments. However, RL struggles with out-of-distribution generalization, often because it over-fits the particulars of the training environment. Although regularization techniques from supervised learning can be applied to avoid over-fitting, the differences between supervised learning and RL limit their application. To address this, we propose the Signal-to-Noise Ratio regulated Parameter Uncertainty Network (SNR PUN) for RL. We introduce SNR as a new measure for regularizing the parameter uncertainty of a network and provide a formal analysis explaining why SNR regularization works well for RL. We demonstrate that our proposed method generalizes in several simulated environments and in a physical system, showing the possibility of using SNR PUN to apply RL to real-world applications.

Topics

Machine Learning > Application Areas > Domain Generalization
Reinforcement Learning > Applications > Robotics

Keywords

reinforcement learning, out-of-distribution generalization, parameter uncertainty, neural network regularization
