Efficient Offline Communication Policies for Factored Multiagent POMDPs

João V. Messias; Matthijs Spaan; Pedro U. Lima

NIPS 2011 (NeurIPS 2011)

Abstract

Factored Decentralized Partially Observable Markov Decision Processes (Dec-POMDPs) form a powerful framework for multiagent planning under uncertainty, but optimal solutions require a rigid history-based policy representation. In this paper we allow inter-agent communication, which turns the problem into a centralized Multiagent POMDP (MPOMDP). We map belief distributions over state factors to an agent's local actions by exploiting structure in the joint MPOMDP policy. The key point is that when dependencies between the agents' decisions are sparse, the belief over an agent's local state factors is often sufficient for it to unequivocally identify the optimal action, and communication can be avoided. We formalize these notions by casting the problem into convex optimization form, and present experimental results illustrating the savings in communication that we can obtain.
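The idea that a local belief can be enough to pick an action can be illustrated with a minimal sketch. It is not the paper's method: it assumes a two-agent problem with two binary state factors, a hand-made set of alpha-vectors (one per joint action), joint beliefs restricted to product form, and a coarse grid search in place of the convex optimization formulation. All function names and the toy values are hypothetical.

```python
from itertools import product


def toy_alpha_vectors():
    """Hypothetical alpha-vectors, one per joint action (a1, a2).

    States are (x1, x2) in the order (0,0), (0,1), (1,0), (1,1).
    Agent 1's action a1 = 1 earns a bonus that depends on x2, which
    couples its decision to the other agent's state factor.
    """
    vectors = []
    for a1, a2 in product((0, 1), repeat=2):
        values = []
        for x1, x2 in product((0, 1), repeat=2):
            v = 1.0 if a2 == x2 else 0.0
            if a1 == x1:
                v += 1.0
            if a1 == 1 and x2 == 1:
                v += 0.5
            values.append(v)
        vectors.append(((a1, a2), values))
    return vectors


def joint_belief(b1, b2):
    """Product belief over (x1, x2), flattened with x1 as the major index."""
    return [p1 * p2 for p1 in b1 for p2 in b2]


def best_joint_action(belief, alpha_vectors):
    """Joint action of the alpha-vector with the highest value at `belief`."""
    action, _ = max(
        alpha_vectors,
        key=lambda av: sum(a * b for a, b in zip(av[1], belief)),
    )
    return action


def local_action_if_unambiguous(b1, alpha_vectors, agent=0, grid=21):
    """Return the agent's action if it is the same for every sampled
    belief over the other factor; return None if it varies, meaning
    the agent would need to communicate.

    Assumption: a grid over b2 stands in for the exact test.
    """
    seen = set()
    for k in range(grid):
        p = k / (grid - 1)
        b2 = [p, 1.0 - p]
        seen.add(best_joint_action(joint_belief(b1, b2), alpha_vectors)[agent])
        if len(seen) > 1:
            return None
    return seen.pop()
```

With these toy vectors, a confident local belief such as `[0.9, 0.1]` yields action `0` for agent 1 regardless of the other factor, while `[0.6, 0.4]` yields `None`, since the best action then depends on information the agent does not hold locally.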

Authors

João V. Messias, Matthijs Spaan, Pedro U. Lima

Topics

Artificial Intelligence > Core AI > Multi-Agent Systems

Artificial Intelligence > Core AI > Planning

Machine Learning > Optimization & Theory > Optimization

Reinforcement Learning > Methods > Multi-Agent Systems

Mathematics & Optimization > Optimization > Optimization

Mathematics & Optimization > Optimization > Game Theory

Keywords

convex optimization, communication policies, partially observable markov decision process, belief distribution, decentralized planning, multi-agent planning, multi-agent pomdp, factored dec-pomdp, multiagent pomdp, factored mdp, multi-agent system, communication policy
