A new regret analysis for Adam-type algorithms

Ahmet Alacaoglu; Yura Malitsky; Panayotis Mertikopoulos; Volkan Cevher

2020 ICML ICML 2020

A new regret analysis for Adam-type algorithms

Abstract

In this paper, we focus on a theory-practice gap for Adam and its variants (AMSGrad, AdamNC, etc.). In practice, these algorithms are used with a constant first-order moment parameter $\beta_{1}$ (typically between $0.9$ and $0.99$). In theory, regret guarantees for online convex optimization require a rapidly decaying $\beta_{1}\to0$ schedule. We show that this is an artifact of the standard analysis, and we propose a novel framework that allows us to derive optimal, data-dependent regret bounds with a constant $\beta_{1}$, without further assumptions. We also demonstrate the flexibility of our analysis on a wide range of different algorithms and settings.

🧭 Keyword Pioneer — first-order moment

🐣 Hot Topic Early Bird — online convex optimization

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy

Authors

Ahmet Alacaoglu , Yura Malitsky , Panayotis Mertikopoulos , Volkan Cevher

Topics

Machine Learning > Optimization & Theory > Learning Theory Machine Learning > Optimization & Theory > Neural Network Optimization

Keywords

online convex optimization regret bound adaptive gradient first-order moment

Download PDF

Related papers

Correlation Clustering with Asymmetric Classification Errors 2020

Learning Portable Representations for High-Level Planning 2020

Proving the Lottery Ticket Hypothesis: Pruning is All You Need 2020

Minimax Pareto Fairness: A Multi Objective Perspective 2020

DeepMatch: Balancing Deep Covariate Representations for Causal Inference Using Adversarial Training 2020