Faster Game Solving via Asymmetry of Step Sizes
Abstract
Abstract Counterfactual Regret Minimization (CFR) algorithms are widely used to compute a Nash equilibrium (NE) in two-player zero-sum imperfect-information extensive-form games (IIGs). Among them, Predictive CFR+ (PCFR+) is particularly powerful, achieving an exceptionally fast empirical convergence rate via the prediction in many games. However, the empirical convergence rate of PCFR+ would significantly degrade if the prediction is inaccurate, leading to unstable performance on certain IIGs. To enhance the robustness of PCFR+, we propose Asymmetric PCFR+ (APCFR+), which employs an adaptive asymmetry of step sizes between the updates of implicit and explicit accumulated counterfactual regrets to mitigate the impact of the prediction inaccuracy on convergence. We present a theoretical analysis demonstrating why APCFR+ can enhance the robustness. To the best of our knowledge, we are the first to propose the asymmetry of step sizes, a simple yet novel technique that effectively improves the robustness of PCFR+. Then, to reduce the difficulty of implementing APCFR+ caused by the adaptive asymmetry, we propose a simplified version of APCFR+ called Simple APCFR+ (SAPCFR+), which uses a fixed asymmetry of step sizes to enable only a single-line modification compared to original PCFR+. Experimental results on five standard IIG benchmarks and two heads-up no-limit Texas Hold’em (HUNL) Subagems show that (i) both APCFR+ and SAPCFR+ outperform PCFR+ in most of the tested games, (ii) SAPCFR+ achieves a comparable empirical convergence rate with APCFR+, and (iii) our approach can be generalized to improve other CFR algorithms, e.g., Discount CFR (DCFR).