Quantifying Generalization in Reinforcement Learning

Karl Cobbe; Oleg Klimov; Chris Hesse; Taehoon Kim; John Schulman

ICML 2019

Abstract

In this paper, we investigate the problem of overfitting in deep reinforcement learning. Among the most common benchmarks in RL, it is customary to use the same environments for both training and testing. This practice offers relatively little insight into an agent’s ability to generalize. We address this issue by using procedurally generated environments to construct distinct training and test sets. Most notably, we introduce a new environment called CoinRun, designed as a benchmark for generalization in RL. Using CoinRun, we find that agents overfit to surprisingly large training sets. We then show that deeper convolutional architectures improve generalization, as do methods traditionally found in supervised learning, including L2 regularization, dropout, data augmentation and batch normalization.
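The train/test protocol described in the abstract can be illustrated with a minimal sketch. It assumes each level is fully determined by an integer seed, so disjoint seed ranges give disjoint training and test sets. Every name here (`make_level`, `split_level_seeds`, `mean_return`, `generalization_gap`, `memorizing_policy`) is hypothetical and chosen for illustration; none of this is CoinRun's actual interface, and the toy "policy" is only a stand-in for an overfit agent.

```python
import random


def make_level(seed):
    """Hypothetical generator: the seed fully determines the level layout."""
    rng = random.Random(seed)
    # A "level" here is just 16 obstacle heights; a real environment is far richer.
    return tuple(rng.randint(0, 9) for _ in range(16))


def split_level_seeds(num_train, num_test):
    """Return disjoint seed ranges for training and held-out test levels."""
    train_seeds = range(0, num_train)
    test_seeds = range(num_train, num_train + num_test)
    return train_seeds, test_seeds


def mean_return(policy, seeds):
    """Average episode return of `policy` over the levels built from `seeds`."""
    returns = [policy(make_level(seed)) for seed in seeds]
    return sum(returns) / len(returns)


def generalization_gap(policy, train_seeds, test_seeds):
    """Training performance minus test performance; larger means more overfitting."""
    return mean_return(policy, train_seeds) - mean_return(policy, test_seeds)


train_seeds, test_seeds = split_level_seeds(num_train=100, num_test=100)

# Toy stand-in for an overfit agent: it succeeds only on levels it has memorized.
memorized = {make_level(seed) for seed in train_seeds}


def memorizing_policy(level):
    return 1.0 if level in memorized else 0.0


gap = generalization_gap(memorizing_policy, train_seeds, test_seeds)
print(f"generalization gap: {gap:.2f}")
```

Evaluating on the training seeds alone would report perfect performance for this memorizer; only the held-out seeds expose the gap, which is the motivation for keeping the two sets distinct.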

Topics

Machine Learning > Application Areas > Domain Generalization
Reinforcement Learning > Methods > Deep RL
Machine Learning > Optimization & Theory > Generalization
Machine Learning > Learning Types > Generalization

Keywords

deep reinforcement learning, domain generalization, data augmentation, procedural generation, convolutional architecture, procedurally generated environment
