Model and Reinforcement Learning for Markov Games with Risk Preferences

Wenjie Huang; Viet Hai Pham; William Benjamin Haskell

AAAI 2020

Abstract

We motivate and propose a new model for non-cooperative Markov games that considers the interactions of risk-aware players. This model characterizes the time-consistent dynamic "risk" arising from both stochastic state transitions (inherent to the game) and randomized mixed strategies (due to all other players). We propose an appropriate risk-aware equilibrium concept and demonstrate the existence of such equilibria in stationary strategies by an application of Kakutani's fixed point theorem. We further propose a simulation-based Q-learning type algorithm for risk-aware equilibrium computation. This algorithm works with a special form of minimax risk measures, which can naturally be written as saddle-point stochastic optimization problems and which cover many widely investigated risk measures. Finally, we demonstrate the almost sure convergence of this simulation-based algorithm to an equilibrium under some mild conditions. Our numerical experiments on a two-player queuing game validate the properties of our model and algorithm, and demonstrate their worth and applicability in real-life competitive decision-making.
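
To illustrate what it means for a risk measure to be written as a stochastic optimization problem, the following minimal sketch evaluates conditional value-at-risk (CVaR) through the Rockafellar-Uryasev variational formula, CVaR_alpha(X) = min over eta of { eta + E[(X - eta)+] / (1 - alpha) }. CVaR is used here only as a familiar example; whether it enters the paper's minimax class in exactly this form is an assumption, and the function names `ru_objective` and `empirical_cvar` are hypothetical.

```python
from typing import Sequence


def ru_objective(losses: Sequence[float], eta: float, alpha: float) -> float:
    """Rockafellar-Uryasev objective: eta + E[(X - eta)+] / (1 - alpha).

    The expectation is replaced by the sample average over `losses`.
    """
    mean_excess = sum(max(x - eta, 0.0) for x in losses) / len(losses)
    return eta + mean_excess / (1.0 - alpha)


def empirical_cvar(losses: Sequence[float], alpha: float) -> float:
    """Empirical CVaR at level alpha, obtained by minimizing over eta.

    The sample objective is convex and piecewise linear in eta with
    kinks at the sample points, so its minimum is attained at one of
    them. Scanning all samples costs O(n^2); fine for illustration.
    """
    if not losses:
        raise ValueError("losses must be non-empty")
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    return min(ru_objective(losses, eta, alpha) for eta in losses)


if __name__ == "__main__":
    sample_losses = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
    # At alpha = 0.8 this matches the average of the worst 20% of losses.
    print(empirical_cvar(sample_losses, 0.8))
    # At alpha = 0 the risk measure reduces to the plain expectation.
    print(empirical_cvar(sample_losses, 0.0))
```

The scan over sample points stands in for the simulation-based updates described in the abstract; it is not the paper's algorithm, only a direct evaluation of the optimization form on a fixed sample.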

Topics

Artificial Intelligence > Core AI > Game AI
Machine Learning > Optimization & Theory > Stochastic Processes
Reinforcement Learning > Methods > Deep RL
Reinforcement Learning > Methods > Multi-Agent Systems
Machine Learning > Learning Types > Reinforcement Learning
Mathematics & Optimization > Optimization > Game Theory
Artificial Intelligence > Core AI > Game Theory
Deep Learning > Learning Types > Reinforcement Learning

Keywords

stochastic optimization; game theory; equilibrium computation; Markov game; risk measure; saddle-point optimization; multi-agent system; risk-aware equilibrium; risk-aware decision making; minimax risk measure
