Scalable Reinforcement Learning of Localized Policies for Multi-Agent Networked Systems

Guannan Qu; Adam Wierman; Na Li

L4DC 2020

Abstract

We study reinforcement learning (RL) in a setting with a network of agents whose states and actions interact in a local manner, where the objective is to find localized policies that maximize the (discounted) global reward. A fundamental challenge in this setting is that the size of the state-action space scales exponentially in the number of agents, rendering the problem intractable for large networks. In this paper, we propose a Scalable Actor Critic (SAC) framework that exploits the network structure and finds a localized policy that is an $O(\rho^\kappa)$-approximation of a stationary point of the objective for some $\rho\in(0,1)$, with complexity that scales with the local state-action space size of the largest $\kappa$-hop neighborhood of the network.
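
To make the scaling claim concrete, the following minimal sketch compares the size of the global state-action space with that of the largest $\kappa$-hop neighborhood. It is an illustration only, not the SAC algorithm itself: the ring topology, the per-agent state and action counts, and all function names (`k_hop_neighborhood`, `ring_graph`, `global_space_size`, `largest_local_space_size`) are assumptions introduced here and do not come from the paper.

```python
from collections import deque


def k_hop_neighborhood(adj, source, kappa):
    """Return the set of nodes within kappa hops of source (source included)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if dist[node] == kappa:
            continue  # do not expand beyond kappa hops
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return set(dist)


def ring_graph(n):
    """Hypothetical example topology: agent i is linked to i-1 and i+1 (mod n)."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}


def global_space_size(n, num_states, num_actions):
    """Joint state-action space size: exponential in the number of agents n."""
    return (num_states * num_actions) ** n


def largest_local_space_size(adj, kappa, num_states, num_actions):
    """State-action space size of the largest kappa-hop neighborhood."""
    largest = max(len(k_hop_neighborhood(adj, i, kappa)) for i in adj)
    return (num_states * num_actions) ** largest


# Assumed toy setting: 20 agents on a ring, 2 local states and 2 local actions each.
n, kappa = 20, 2
adj = ring_graph(n)
global_size = global_space_size(n, 2, 2)
local_size = largest_local_space_size(adj, kappa, 2, 2)
print(f"global: {global_size}, largest {kappa}-hop neighborhood: {local_size}")
```

In this toy setting each 2-hop neighborhood on the ring contains 5 agents, so the local space has $4^5$ entries while the global space has $4^{20}$. Increasing $\kappa$ tightens the $O(\rho^\kappa)$ approximation at the cost of a larger local space.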
