Lenient Learning in Independent-Learner Stochastic Cooperative Games

Ermo Wei; Sean Luke

JMLR, 2016

Abstract

We introduce the Lenient Multiagent Reinforcement Learning 2 (LMRL2) algorithm for independent-learner stochastic cooperative games. LMRL2 is designed to overcome a pathology called relative overgeneralization, and to do so while still performing well in games with stochastic transitions, stochastic rewards, and miscoordination. We discuss the existing literature, then compare LMRL2 against other algorithms drawn from the literature which can be used for games of this kind: traditional ("Distributed") Q-learning, Hysteretic Q-learning, WoLF-PHC, SOoN, and (for repeated games only) FMQ. The results show that LMRL2 is very effective on both of our measures (complete and correct policies), and that it is found in the top rank more often than any other technique. LMRL2 is also easy to tune: though it has many available parameters, almost all of them can stay at their default settings. Generally the algorithm is optimally tuned by adjusting a single parameter, if any. We then examine and discuss a number of side issues and options for LMRL2.

Topics

Artificial Intelligence > Core AI > Multi-Agent Systems; Reinforcement Learning > Methods > Multi-Agent Systems

Keywords

multi-agent reinforcement learning, stochastic reward, cooperative game, relative overgeneralization
