Fast Last-Iterate Convergence of Learning in Games Requires Forgetful Algorithms

Yang Cai; Gabriele Farina; Julien Grand-Clément; Christian Kroer; Chung-Wei Lee; Haipeng Luo; Weiqiang Zheng

2024 NIPS NeurIPS 2024

Fast Last-Iterate Convergence of Learning in Games Requires Forgetful Algorithms

Abstract

Self play via online learning is one of the premier ways to solve large-scale zero-sum games, both in theory and practice. Particularly popular algorithms include optimistic multiplicative weights update (OMWU) and optimistic gradient-descent-ascent (OGDA). While both algorithms enjoy $O(1/T)$ ergodic convergence to Nash equilibrium in two-player zero-sum games, OMWU offers several advantages, including logarithmic dependence on the size of the payoff matrix and $\tilde{O}(1/T)$ convergence to coarse correlated equilibria even in general-sum games. However, in terms of last-iterate convergence in two-player zero-sum games, an increasingly popular topic in this area, OGDA guarantees that the duality gap shrinks at a rate of $(1/\sqrt{T})$, while the best existing last-iterate convergence for OMWU depends on some game-dependent constant that could be arbitrarily large. This begs the question: is this potentially slow last-iterate convergence an inherent disadvantage of OMWU, or is the current analysis too loose? Somewhat surprisingly, we show that the former is true. More generally, we prove that a broad class of algorithms that do not forget the past quickly all suffer the same issue: for any arbitrarily small $\delta>0$, there exists a $2\times 2$ matrix game such that the algorithm admits a constant duality gap even after $1/\delta$ rounds. This class of algorithms includes OMWU and other standard optimistic follow-the-regularized-leader algorithms.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Machine Learning and Mathematics & Optimization

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

🧭 Keyword Pioneer — optimistic multiplicative weight

Authors

Yang Cai , Gabriele Farina , Julien Grand-Clément , Christian Kroer , Chung-Wei Lee , Haipeng Luo , Weiqiang Zheng

Topics

Artificial Intelligence > Core AI > Game AI Artificial Intelligence > Core AI > Multi-Agent Systems Machine Learning > Optimization & Theory > Optimization Mathematics & Optimization > Optimization > Online Algorithms Machine Learning > Optimization & Theory > Online Algorithms Mathematics & Optimization > Optimization > Game Theory Artificial Intelligence > Core AI > Game Theory

Keywords

online learning game theory nash equilibrium zero-sum game last-iterate convergence duality gap multiplicative weights update optimistic gradient descent optimistic multiplicative weight coarse correlated equilibria

Download PDF

Related papers

SPIQA: A Dataset for Multimodal Question Answering on Scientific Papers 2024

Training for Stable Explanation for Free 2024

NeuralSolver: Learning Algorithms For Consistent and Efficient Extrapolation Across General Tasks 2024

Expectation Alignment: Handling Reward Misspecification in the Presence of Expectation Mismatch 2024

MicroAdam: Accurate Adaptive Optimization with Low Space Overhead and Provable Convergence 2024