Credit Assignment For Collective Multiagent RL With Global Rewards

Duc Thien Nguyen; Akshat Kumar; Hoong Chuin Lau

2018 NIPS NeurIPS 2018

Credit Assignment For Collective Multiagent RL With Global Rewards

Abstract

Scaling decision theoretic planning to large multiagent systems is challenging due to uncertainty and partial observability in the environment. We focus on a multiagent planning model subclass, relevant to urban settings, where agent interactions are dependent on their ``collective influence'' on each other, rather than their identities. Unlike previous work, we address a general setting where system reward is not decomposable among agents. We develop collective actor-critic RL approaches for this setting, and address the problem of multiagent credit assignment, and computing low variance policy gradient estimates that result in faster convergence to high quality solutions. We also develop difference rewards based credit assignment methods for the collective setting. Empirically our new approaches provide significantly better solutions than previous methods in the presence of global rewards on two real world problems modeling taxi fleet optimization and multiagent patrolling, and a synthetic grid navigation domain.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Machine Learning and Reinforcement Learning

🧭 Keyword Pioneer — collective learning

🐣 Hot Topic Early Bird — multi-agent reinforcement learning

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Duc Thien Nguyen , Akshat Kumar , Hoong Chuin Lau

Topics

Artificial Intelligence > Core AI > Multi-Agent Systems Reinforcement Learning > Methods > Policy Learning Reinforcement Learning > Methods > Multi-Agent Systems Machine Learning > Learning Types > Multi-Agent Systems

Keywords

multi-agent reinforcement learning policy gradient credit assignment collective learning global reward

Download PDF

Related papers

Maximum Causal Tsallis Entropy Imitation Learning 2018

Recurrent World Models Facilitate Policy Evolution 2018

Bandit Learning in Concave N-Person Games 2018

Algorithmic Assurance: An Active Approach to Algorithmic Testing using Bayesian Optimisation 2018

PAC-Bayes bounds for stable algorithms with instance-dependent priors 2018