Mutation-driven follow the regularized leader for last-iterate convergence in zero-sum games

Kenshi Abe; Mitsuki Sakamoto; Atsushi Iwasaki

UAI 2022

Abstract

In this study, we consider a variant of the Follow the Regularized Leader (FTRL) dynamics in two-player zero-sum games. FTRL is guaranteed to converge to a Nash equilibrium when its strategies are time-averaged, but many of its variants exhibit limit cycling, i.e., they lack a last-iterate convergence guarantee. To address this, we propose mutant FTRL (M-FTRL), an algorithm that introduces mutation to perturb action probabilities. We then investigate the continuous-time dynamics of M-FTRL and provide strong convergence guarantees toward stationary points that approximate Nash equilibria under full-information feedback. Furthermore, our simulations demonstrate that M-FTRL can converge faster than FTRL and optimistic FTRL under full-information feedback and, surprisingly, exhibits clear convergence under bandit feedback.
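The contrast between cycling and last-iterate convergence can be illustrated with a small simulation. The sketch below is not the authors' implementation: the abstract does not give the update rule, so it assumes an entropy regularizer (FTRL then reduces to a softmax over cumulative payoffs) and a mutation term of the form mu * (c(a) - pi(a)) / pi(a) added to each action's payoff, modeled on replicator-mutator dynamics. The game (rock-paper-scissors), the uniform reference strategy `c`, the step size `eta`, the mutation rate `mu`, and all function names are illustrative assumptions.

```python
import math

# Row player's payoff matrix for rock-paper-scissors (illustrative choice).
A = [[0.0, -1.0, 1.0],
     [1.0, 0.0, -1.0],
     [-1.0, 1.0, 0.0]]


def softmax(z):
    """Entropy-regularized FTRL maps cumulative scores to a strategy via softmax."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def run(mu, steps=5000, eta=0.05):
    """Simultaneous discrete-time updates; mu = 0 gives plain FTRL."""
    n = len(A)
    c = [1.0 / n] * n        # assumed reference strategy for the mutation term
    z1 = [1.0, 0.0, 0.0]     # cumulative scores, deliberately off-equilibrium
    z2 = [0.0, 1.0, 0.0]
    tiny = 1e-12             # guards the division if a probability underflows
    for _ in range(steps):
        x, y = softmax(z1), softmax(z2)
        # Expected payoff of each action against the opponent's current strategy.
        q1 = [sum(A[a][b] * y[b] for b in range(n)) for a in range(n)]
        q2 = [-sum(A[a][b] * x[a] for a in range(n)) for b in range(n)]
        # Assumed mutation term: pulls each probability toward c.
        z1 = [z1[a] + eta * (q1[a] + mu * (c[a] - x[a]) / max(x[a], tiny))
              for a in range(n)]
        z2 = [z2[b] + eta * (q2[b] + mu * (c[b] - y[b]) / max(y[b], tiny))
              for b in range(n)]
    return softmax(z1), softmax(z2)


def max_dev_from_uniform(x, y):
    """Largest gap between any action probability and the uniform equilibrium."""
    n = len(x)
    return max(abs(p - 1.0 / n) for p in x + y)


if __name__ == "__main__":
    print("plain FTRL :", max_dev_from_uniform(*run(mu=0.0)))
    print("with mu=0.1:", max_dev_from_uniform(*run(mu=0.1)))
```

In this particular game the equilibrium and the assumed reference strategy are both uniform, so the mutated dynamics settle on the equilibrium itself. In general, as the abstract states, the stationary point only approximates a Nash equilibrium.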

Topics

Artificial Intelligence > Core AI > Game AI
Mathematics & Optimization > Optimization > Continuous Optimization

Keywords

Nash equilibrium, convergence guarantee, zero-sum game
