Limiting Behaviors of Nonconvex-Nonconcave Minimax Optimization via Continuous-Time Systems

Benjamin Grimmer; Haihao Lu; Pratik Worah; Vahab Mirrokni

ALT 2022

Abstract

Unlike nonconvex optimization, where gradient descent is guaranteed to converge to a local optimizer, algorithms for nonconvex-nonconcave minimax optimization can have topologically different solution paths: sometimes converging to a solution, sometimes never converging and instead following a limit cycle, and sometimes diverging. In this paper, we study the limiting behaviors of three classic minimax algorithms: gradient descent ascent (GDA), alternating gradient descent ascent (AGDA), and the extragradient method (EGM). Numerically, we observe that all of these limiting behaviors can arise in Generative Adversarial Network (GAN) training and are easily demonstrated even in simple GAN models. To explain these different behaviors, we study the high-order resolution continuous-time dynamics that correspond to each algorithm, which yield sufficient (and almost necessary) conditions for the local convergence of each method. Moreover, this ODE perspective allows us to characterize the phase transition between these potentially nonconvergent limiting behaviors caused by introducing regularization in the problem instance.
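As a rough illustration of how the three update rules named in the abstract differ, the following minimal sketch runs GDA, AGDA, and EGM on the toy bilinear objective f(x, y) = x * y (minimize over x, maximize over y). The objective, the step size s = 0.1, the starting point, and all function names are assumptions chosen for illustration and are not taken from the paper; the bilinear problem is convex-concave, so it only hints at the nonconvex-nonconcave behaviors the paper studies.

```python
import math


def grad(x, y):
    """Gradient of the assumed toy objective f(x, y) = x * y."""
    return y, x  # (df/dx, df/dy)


def gda_step(x, y, s):
    """Gradient descent ascent: both players update simultaneously."""
    gx, gy = grad(x, y)
    return x - s * gx, y + s * gy


def agda_step(x, y, s):
    """Alternating GDA: y's ascent step uses the already-updated x."""
    gx, _ = grad(x, y)
    x_new = x - s * gx
    _, gy = grad(x_new, y)
    return x_new, y + s * gy


def egm_step(x, y, s):
    """Extragradient: take a trial half step, then update from the
    original point using the gradient evaluated at the trial point."""
    gx, gy = grad(x, y)
    x_half, y_half = x - s * gx, y + s * gy
    gx_h, gy_h = grad(x_half, y_half)
    return x - s * gx_h, y + s * gy_h


def run(step, x0=1.0, y0=1.0, s=0.1, iters=1000):
    """Iterate one method and return the final distance to the saddle (0, 0)."""
    x, y = x0, y0
    for _ in range(iters):
        x, y = step(x, y, s)
    return math.hypot(x, y)


if __name__ == "__main__":
    for name, step in [("GDA", gda_step), ("AGDA", agda_step), ("EGM", egm_step)]:
        print(f"{name:>4}: distance to saddle after 1000 steps = {run(step):.4g}")
```

On this toy problem the three methods already separate: the GDA iterates spiral outward, the AGDA iterates stay on a bounded closed orbit around the saddle point, and the EGM iterates spiral inward. This mirrors, in a much simpler setting, the diverging, cycling, and converging solution paths described in the abstract.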

Authors

Benjamin Grimmer, Haihao Lu, Pratik Worah, Vahab Mirrokni

Topics

Machine Learning > Learning Types > Adversarial Learning
Deep Learning > Models > Generative Models

Keywords

minimax optimization, generative adversarial network, gradient descent ascent, continuous-time dynamics, limit cycle
