Sequential Order Adjustment of Action Decisions for Multi-Agent Transformer (Student Abstract)

Shota Takayama; Katsuhide Fujita

AAAI 2025

Abstract

Multi-agent reinforcement learning (MARL) trains multiple agents in a shared environment. Recent MARL models have significantly improved performance by having agents decide their actions sequentially. However, these models do not explicitly consider the importance of the order in which agents make decisions. We propose AOAD-MAT, a novel model that incorporates the action decision sequence into learning. AOAD-MAT uses a Transformer-based actor-critic architecture to dynamically adjust the order in which agents act. It introduces a subtask that predicts the next agent to act, which is integrated into a PPO-based loss function. Experiments on the StarCraft Multi-Agent Challenge and Multi-Agent MuJoCo benchmarks show that AOAD-MAT outperforms existing models, demonstrating the effectiveness of adjusting agent order in MARL.
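
The abstract does not give the exact form of the loss, so the following is only a minimal sketch of how a next-agent prediction subtask might be folded into a PPO-style objective. It assumes the subtask is trained with a softmax cross-entropy loss and added to the clipped PPO policy loss with a weighting coefficient. The names `clipped_surrogate`, `cross_entropy`, `combined_loss`, and `order_coef` are hypothetical and are not taken from the paper; the value loss and entropy bonus usually present in PPO are omitted for brevity.

```python
import math


def clipped_surrogate(ratio, advantage, clip_eps=0.2):
    """Standard PPO clipped objective for one sample (to be maximized)."""
    clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
    return min(ratio * advantage, clipped * advantage)


def cross_entropy(logits, target):
    """Softmax cross-entropy of `logits` against the class index `target`."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]


def combined_loss(ratios, advantages, order_logits, next_agents,
                  order_coef=0.5, clip_eps=0.2):
    """Negative mean PPO surrogate plus a weighted next-agent prediction loss.

    ratios, advantages: per-sample probability ratios and advantage estimates.
    order_logits: per-sample scores over candidate agents (assumed head output).
    next_agents: per-sample index of the agent that actually acted next.
    order_coef: hypothetical weight on the subtask; not a value from the paper.
    """
    n = len(ratios)
    policy_loss = -sum(
        clipped_surrogate(r, a, clip_eps) for r, a in zip(ratios, advantages)
    ) / n
    order_loss = sum(
        cross_entropy(l, t) for l, t in zip(order_logits, next_agents)
    ) / n
    return policy_loss + order_coef * order_loss


# Toy usage: two samples, three candidate agents.
loss = combined_loss(
    ratios=[1.1, 0.9],
    advantages=[0.5, -0.2],
    order_logits=[[2.0, 0.1, -1.0], [0.0, 1.5, 0.3]],
    next_agents=[0, 1],
)
```

In this sketch, raising `order_coef` puts more weight on predicting the acting order relative to the policy improvement term; how AOAD-MAT actually balances the two is not stated in the abstract.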

Topics

Artificial Intelligence > Core AI > Multi-Agent Systems

Deep Learning > Architectures > Transformers

Reinforcement Learning > Methods > Multi-Agent Systems

Machine Learning > Learning Types > Multi-Agent Systems

Reinforcement Learning > Applications > Multi-Agent Systems

Keywords

transformer architecture, multi-agent reinforcement learning, sequential decision-making, cooperative multi-agent, agent coordination, action order, PPO algorithm, action ordering
