Policy Gradient Methods Find the Nash Equilibrium in N-player General-sum Linear-quadratic Games

Ben Hambly; Renyuan Xu; Huining Yang

JMLR, 2023

Abstract

We consider a general-sum N-player linear-quadratic game with stochastic dynamics over a finite horizon and prove the global convergence of the natural policy gradient method to the Nash equilibrium. Proving convergence of the method requires a certain amount of noise in the system. We give a condition, essentially a lower bound on the covariance of the noise in terms of the model parameters, that guarantees convergence. We illustrate our results with numerical experiments showing that, even in situations where the policy gradient method may not converge in the deterministic setting, the addition of noise leads to convergence.

© JMLR 2023.
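The setting named in the abstract can be written out in a common textbook form. The block below is a sketch under assumed notation, not necessarily the paper's own: the symbols $A$, $B^i$, $Q^i$, $R^i$, $Q_T^i$, $W$, $K_t^i$, $\eta$ and $\Sigma_t$ are introduced here for illustration only, and the policies are assumed to be linear state feedback.

```latex
\begin{align*}
  % Stochastic linear dynamics driven by N players over a finite horizon T
  x_{t+1} &= A x_t + \sum_{i=1}^{N} B^i u_t^i + w_t,
      \qquad t = 0, \dots, T-1, \quad \mathbb{E}[w_t] = 0, \ \operatorname{Cov}(w_t) = W, \\
  % Linear feedback policy of player i (assumed policy class)
  u_t^i &= -K_t^i x_t, \\
  % Quadratic cost of player i; general-sum means the costs need not add up to zero
  C^i(K) &= \mathbb{E}\Big[ \sum_{t=0}^{T-1} \big( x_t^\top Q^i x_t + (u_t^i)^\top R^i u_t^i \big)
      + x_T^\top Q_T^i x_T \Big], \\
  % Natural policy gradient step for player i with step size \eta
  K_t^i &\leftarrow K_t^i - \eta \, \nabla_{K_t^i} C^i(K) \, \Sigma_t^{-1},
      \qquad \Sigma_t = \mathbb{E}\big[ x_t x_t^\top \big].
\end{align*}
```

In this notation, if the noise $w_t$ is independent of $x_t$, then $\Sigma_{t+1} \succeq W$, so a noise covariance bounded away from zero keeps the state covariance invertible. This is one plausible reading of why a lower bound on the noise appears in the convergence condition; the precise condition is given in the paper itself.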

Authors

Ben Hambly, Renyuan Xu, Huining Yang

Topics

Artificial Intelligence > Core AI > Multi-Agent Systems

Reinforcement Learning > Methods > Policy Learning

Reinforcement Learning > Methods > Multi-Agent Systems

Machine Learning > Learning Types > Multi-Agent Systems

Mathematics & Optimization > Optimization > Game Theory

Keywords

policy gradient, global convergence, Nash equilibrium, general-sum game, multi-agent system, linear-quadratic game

Related papers

Flexible Model Aggregation for Quantile Regression 2023

Efficient Computation of Rankings from Pairwise Comparisons 2023

Efficient Structure-preserving Support Tensor Train Machine 2023

Attacks against Federated Learning Defense Systems and their Mitigation 2023

How Do You Want Your Greedy: Simultaneous or Repeated? 2023