Bounding Regret in Empirical Games

Steven Jecmen; Arunesh Sinha; Zun Li; Long Tran-Thanh

2020 AAAI AAAI 2020

Bounding Regret in Empirical Games

Abstract

Abstract Empirical game-theoretic analysis refers to a set of models and techniques for solving large-scale games. However, there is a lack of a quantitative guarantee about the quality of output approximate Nash equilibria (NE). A natural quantitative guarantee for such an approximate NE is the regret in the game (i.e. the best deviation gain). We formulate this deviation gain computation as a multi-armed bandit problem, with a new optimization goal unlike those studied in prior work. We propose an efficient algorithm Super-Arm UCB (SAUCB) for the problem and a number of variants. We present sample complexity results as well as extensive experiments that show the better performance of SAUCB compared to several baselines.

🌉 Interdisciplinary Bridge — Mathematics & Optimization and Reinforcement Learning

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Steven Jecmen , Arunesh Sinha , Zun Li , Long Tran-Thanh

Topics

Reinforcement Learning > Applications > Game AI Mathematics & Optimization > Optimization > Online Algorithms

Keywords

sample complexity multi-armed bandit regret bound empirical game theory approximate nash equilibrium

Download PDF

Related papers

Enhancing Pointer Network for Sentence Ordering with Pairwise Ordering Predictions 2020

CopyMTL: Copy Mechanism for Joint Extraction of Entities and Relations with Multi-Task Learning 2020

Neural Simile Recognition with Cyclic Multitask Learning and Local Attention 2020

Being Optimistic to Be Conservative: Quickly Learning a CVaR Policy 2020

Multi-Point Semantic Representation for Intent Classification 2020