Efficient Monte Carlo Counterfactual Regret Minimization in Games with Many Player Actions

Neil Burch; Marc Lanctot; Duane Szafron; Richard G. Gibson

NIPS 2012 (NeurIPS 2012)

Abstract

Counterfactual Regret Minimization (CFR) is a popular iterative algorithm for computing strategies in extensive-form games. The Monte Carlo CFR (MCCFR) variants reduce the per-iteration time cost of CFR by traversing only a sampled portion of the game tree. The most effective previous instances of MCCFR can still be very slow in games with many player actions, since they sample every action for a given player. In this paper, we present a new MCCFR algorithm, Average Strategy Sampling (AS), that samples a subset of the player's actions according to the player's average strategy. Our new algorithm is inspired by a new, tighter bound on the number of iterations required by CFR to converge to a given solution quality. In addition, we prove a similar, tighter bound for AS and other popular MCCFR variants. Finally, we validate our work by demonstrating that AS converges faster than previous MCCFR algorithms in both no-limit poker and Bluff.
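
To make the idea of sampling a subset of a player's actions according to the average strategy concrete, here is a minimal Python sketch. It is not the paper's algorithm: the abstract does not give the exact sampling rule, so the function name `sample_actions`, the dictionary representation of the cumulative strategy, and the exploration floor `epsilon` are all assumptions made for illustration. The sketch also omits the importance weighting that a full MCCFR traversal would need to correct for the sampled actions.

```python
import random


def sample_actions(cumulative_strategy, epsilon=0.05, rng=random):
    """Pick the subset of actions to traverse at one decision point.

    cumulative_strategy: dict mapping action -> nonnegative accumulated
        strategy weight (a hypothetical representation, assumed here).
    epsilon: assumed lower bound on each action's sampling probability,
        so that rarely played actions are still explored occasionally.
    """
    total = sum(cumulative_strategy.values())
    num_actions = len(cumulative_strategy)
    sampled = []
    for action, weight in cumulative_strategy.items():
        # Average-strategy probability; fall back to uniform before any
        # weight has been accumulated.
        avg_prob = weight / total if total > 0 else 1.0 / num_actions
        # Each action is kept independently, so the subset size varies.
        keep_prob = max(epsilon, avg_prob)
        if rng.random() < keep_prob:
            sampled.append(action)
    return sampled
```

Under these assumptions, actions that the average strategy plays often are traversed on most iterations, while actions it rarely plays are traversed only occasionally. That is how a scheme of this kind could avoid visiting every action in games with many player actions.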

Topics

Artificial Intelligence > Core AI > Game AI
Artificial Intelligence > Core AI > Multi-Agent Systems
Artificial Intelligence > Core AI > Game Theory
Reinforcement Learning > Methods > Multi-Agent Systems
Reinforcement Learning > Applications > Game AI
Machine Learning > Learning Types > Multi-Agent Systems
Deep Learning > Learning Types > Reinforcement Learning
Mathematics & Optimization > Optimization > Online Algorithms
Mathematics & Optimization > Optimization > Game Theory

Keywords

game theory, poker, Monte Carlo sampling, counterfactual regret minimization, equilibrium computation, Bluff, extensive-form game, Monte Carlo method, sequential game, strategy convergence, average strategy sampling
