Strategically efficient exploration in competitive multi-agent reinforcement learning

Robert Loftin; Aadirupa Saha; Sam Devlin; Katja Hofmann

UAI 2021

Abstract

High sample complexity remains a barrier to the application of reinforcement learning (RL), particularly in multi-agent systems. A large body of work has demonstrated that exploration mechanisms based on the principle of optimism under uncertainty can significantly improve the sample efficiency of RL in single-agent tasks. This work seeks to understand the role of optimistic exploration in non-cooperative multi-agent settings. We show that, in zero-sum games, optimistic exploration can cause the learner to waste time sampling parts of the state space that are irrelevant to strategic play, as they can only be reached through cooperation between the two players. To address this issue, we introduce a formal notion of strategically efficient exploration in Markov games, and use it to develop two strategically efficient learning algorithms for finite Markov games. We demonstrate that these methods can be significantly more sample efficient than their optimistic counterparts.
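The claim that some outcomes are irrelevant to strategic play can be illustrated with a toy zero-sum matrix game. The sketch below is not taken from the paper: the payoff numbers and the function names (`pure_maximin`, `pure_minimax`, `game_with_unknown`) are assumptions chosen for illustration, and it only checks pure strategies rather than implementing either of the paper's algorithms. One payoff entry is unknown and can be reached only if the minimizing player picks a column that works against its own interest; the value of the game is the same whether that entry is 0 or 1.

```python
# Toy illustration with hypothetical numbers (not from the paper).
# Payoffs are for the row player, who maximizes; the column player minimizes.

def pure_maximin(payoff):
    """Best payoff the row player can guarantee with a pure strategy."""
    return max(min(row) for row in payoff)

def pure_minimax(payoff):
    """Lowest payoff the column player can enforce with a pure strategy."""
    columns = zip(*payoff)
    return min(max(col) for col in columns)

def game_with_unknown(u):
    """Entry (row 0, column 0) is the unexplored outcome with payoff u in [0, 1]."""
    return [[u, 0.2],
            [0.6, 0.2]]

values = []
for u in (0.0, 1.0):  # pessimistic and optimistic guesses for the unknown entry
    game = game_with_unknown(u)
    lower, upper = pure_maximin(game), pure_minimax(game)
    assert lower == upper  # a pure saddle point exists in this toy game
    values.append(lower)

print(values)  # the game value does not depend on the unknown entry
```

In this toy game the column player always prefers column 1, so the unknown entry is reached only if both players cooperate to get there; sampling it cannot change the equilibrium value.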

Authors

Robert Loftin, Aadirupa Saha, Sam Devlin, Katja Hofmann

Topics

Artificial Intelligence > Core AI > Multi-Agent Systems
Machine Learning > Optimization & Theory > Learning Theory
Reinforcement Learning > Methods > Multi-Agent Systems

Keywords

multi-agent reinforcement learning, sample efficiency, optimistic exploration, zero-sum game, Markov game, strategic efficiency
