UAI 2020
Learning Intrinsic Rewards as a Bi-Level Optimization Problem
Abstract
We reinterpret the problem of finding intrinsic rewards in reinforcement learning (RL) as a bilevel optimization problem. Using this interpretation, we can make use of recent advances in the hyperparameter optimization literature, mainly from Self-Tuning Networks (STN), to learn intrinsic rewards. To facilitate our method, we introduce a new general conditioning layer: Conditional Layer Normalization (CLN). We evaluate our method on several continuous control benchmarks in the MuJoCo physics simulator. On all of these benchmarks, the intrinsic rewards learned on the fly lead to higher final rewards.
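For orientation, a generic bilevel formulation of intrinsic-reward learning can be sketched as follows. This is an illustrative reading of the abstract, not the paper's exact objective. The symbols are assumptions introduced here: $\theta$ for the policy parameters, $\eta$ for the intrinsic-reward parameters, $J^{\mathrm{ex}}$ for the expected extrinsic return, and $J^{\mathrm{ex+in}}$ for the expected return under the sum of extrinsic and intrinsic rewards.

```latex
\begin{aligned}
\max_{\eta}\quad & J^{\mathrm{ex}}\big(\theta^{*}(\eta)\big) \\
\text{s.t.}\quad & \theta^{*}(\eta) \in \arg\max_{\theta}\; J^{\mathrm{ex+in}}(\theta, \eta)
\end{aligned}
```

Under this reading, the inner problem trains the policy on the combined reward, while the outer problem tunes the intrinsic reward so that the resulting policy does well on the extrinsic reward alone. The intrinsic-reward parameters play the same role as hyperparameters in hyperparameter optimization, which is what would make methods from that literature applicable.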
Topics
Machine Learning > Optimization & Theory > Optimization
Reinforcement Learning
Reinforcement Learning > Methods > Policy Learning
Machine Learning > Learning Types > Reinforcement Learning
Mathematics & Optimization > Optimization > Optimization
Artificial Intelligence > Core AI > Reinforcement Learning