2016 JMLR JMLR 2016

Lenient Learning in Independent-Learner Stochastic Cooperative Games

Abstract

We introduce the Lenient Multiagent Reinforcement Learning 2 (LMRL2) algorithm for independent-learner stochastic cooperative games. LMRL2 is designed to overcome a pathology called relative overgeneralization, and to do so while still performing well in games with stochastic transitions, stochastic rewards, and miscoordination. We discuss the existing literature, then compare LMRL2 against other algorithms drawn from the literature which can be used for games of this kind: traditional (“Distributed”) Q-learning, Hysteretic Q-learning, WoLF-PHC, SOoN, and (for repeated games only) FMQ. The results show that LMRL2 is very effective in both of our measures (complete and correct policies), and is found in the top rank more often than any other technique. LMRL2 is also easy to tune: though it has many available parameters, almost all of them stay at default settings. Generally the algorithm is optimally tuned with a single parameter, if any. We then examine and discuss a number of side-issues and options for LMRL2. [abs] [ pdf ][ bib ] © JMLR 2016. (edit, beta)

🌉 Interdisciplinary Bridge — Artificial Intelligence and Reinforcement Learning
📈 Trend Setter — Multi-Agent Systems
🧭 Keyword Pioneer — relative overgeneralization
🐣 Hot Topic Early Bird — multi-agent reinforcement learning
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy

Authors