Sequential Order Adjustment of Action Decisions for Multi-Agent Transformer (Student Abstract)
Abstract
Abstract Multi-agent reinforcement learning (MARL) trains multiple agents in shared environments. Recently, MARL models have significantly improved performance by leveraging sequential decision-making processes. Although these models can enhance performance, they do not explicitly con-sider the importance of the order in which agents make decisions. We propose AOAD-MAT, a novel model incorporating action decision sequence into learning. AOAD-MAT uses a Transformer-based actor-critic architecture to dynamically adjust agent action order. It introduces a subtask predicting the next agent to act, integrated into a PPO-based loss function. Experiments on StarCraft Multi-Agent Challenge and Multi-Agent MuJoCo benchmarks show AOAD-MAT out-performs existing models, demonstrating the effectiveness of adjusting agent order in MARL.