Open-ended learning in symmetric zero-sum games

David Balduzzi; Marta Garnelo; Yoram Bachrach; Wojciech Czarnecki; Julien Pérolat; Max Jaderberg; Thore Graepel

2019 ICML ICML 2019

Open-ended learning in symmetric zero-sum games

Abstract

Zero-sum games such as chess and poker are, abstractly, functions that evaluate pairs of agents, for example labeling them ‘winner’ and ‘loser’. If the game is approximately transitive, then self-play generates sequences of agents of increasing strength. However, nontransitive games, such as rock-paper-scissors, can exhibit strategic cycles, and there is no longer a clear objective – we want agents to increase in strength, but against whom is unclear. In this paper, we introduce a geometric framework for formulating agent objectives in zero-sum games, in order to construct adaptive sequences of objectives that yield open-ended learning. The framework allows us to reason about population performance in nontransitive games, and enables the development of a new algorithm (rectified Nash response, PSRO_rN) that uses game-theoretic niching to construct diverse populations of effective agents, producing a stronger set of agents than existing algorithms. We apply PSRO_rN to two highly nontransitive resource allocation games and find that PSRO_rN consistently outperforms the existing alternatives.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Machine Learning and Reinforcement Learning

🧭 Keyword Pioneer — strategic cycle

🐣 Hot Topic Early Bird — multi-agent system

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics

Authors

David Balduzzi , Marta Garnelo , Yoram Bachrach , Wojciech Czarnecki , Julien Pérolat , Max Jaderberg , Thore Graepel

Topics

Artificial Intelligence > Core AI > Game AI Artificial Intelligence > Core AI > Multi-Agent Systems Machine Learning > Learning Types > Adversarial Learning Reinforcement Learning > Methods > Multi-Agent Systems Artificial Intelligence > Core AI > Game Theory

Keywords

game theory nash equilibrium zero-sum game strategic cycle population performance multi-agent system

Download PDF

Related papers

Bayesian leave-one-out cross-validation for large data 2019

A Block Coordinate Descent Proximal Method for Simultaneous Filtering and Parameter Estimation 2019

EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks 2019

Beating Stochastic and Adversarial Semi-bandits Optimally and Simultaneously 2019

Improved Convergence for $\ell_1$ and $\ell_∞$ Regression via Iteratively Reweighted Least Squares 2019