AAAI 2026

Safe Multi-agent Reinforcement Learning with Natural Language Constraints

Abstract

Safe Multi-Agent Reinforcement Learning (MARL) typically relies on manually specified numeric cost functions to ensure that policy behaviours respect safety constraints. As systems scale and human-defined constraints become more diverse, context-dependent, and frequently updated, hand-crafting such cost functions becomes prohibitively complex, tedious, and error-prone. Natural language offers an intuitive and flexible alternative for defining constraints, enabling broader accessibility and easier adaptation to new scenarios and evolving rules. However, current MARL frameworks lack effective mechanisms to incorporate free-form textual constraints in a robust and principled way. To bridge this gap, we introduce Safe Multi-Agent Reinforcement Learning with natural Language constraints (SMALL), a framework that leverages fine-tuned language models to parse and encode textual constraints into semantically meaningful embeddings. These embeddings characterise prohibited states or behaviours and enable automatic prediction of constraint violations. We integrate the resulting learned costs directly into MARL training, allowing agents to optimise task performance while simultaneously minimising constraint violations, without requiring manually engineered numeric cost functions. To rigorously evaluate our method, we also propose the LaMaSafe benchmark, a set of diverse multi-agent tasks designed to assess the capability of MARL algorithms to understand and adhere to realistic, human-provided natural language constraints. Experimental results across LaMaSafe environments show that SMALL achieves comparable task performance to strong MARL baselines while significantly reducing constraint violations. While SMALL does not provide formal safety guarantees, it demonstrates that natural language can be used to shape multi-agent behaviour toward safer policies.
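As a rough illustration of the idea of turning constraint embeddings into a learned cost, the sketch below scores a state against a constraint by cosine similarity and folds the predicted cost into a penalised reward. It is a minimal sketch under stated assumptions, not the SMALL implementation: the embeddings are hand-written stand-ins for language model outputs, the similarity threshold is arbitrary, and the Lagrangian-style penalty and multiplier update are one common way to handle cost constraints that the abstract does not confirm. All function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def predicted_cost(constraint_emb, state_emb, threshold=0.8):
    """Return 1.0 if the state embedding is close to the prohibited-behaviour
    embedding, else 0.0. The threshold is an assumed hyperparameter."""
    return 1.0 if cosine(constraint_emb, state_emb) >= threshold else 0.0


def penalised_reward(task_reward, cost, lam):
    """Assumed Lagrangian-style shaping: task reward minus weighted cost."""
    return task_reward - lam * cost


def update_multiplier(lam, mean_cost, budget, lr=0.1):
    """Raise the penalty weight when the mean cost exceeds the budget,
    lower it otherwise, and keep it non-negative."""
    return max(0.0, lam + lr * (mean_cost - budget))


# Toy stand-ins for embeddings of a textual constraint and two state descriptions.
constraint = [1.0, 0.0]
near_violation = [0.9, 0.1]
safe_state = [0.0, 1.0]

cost_near = predicted_cost(constraint, near_violation)
cost_safe = predicted_cost(constraint, safe_state)
shaped = penalised_reward(task_reward=1.0, cost=cost_near, lam=0.5)
```

In this toy setup the per-agent shaped reward would replace the raw task reward during policy updates, while the multiplier is adjusted between updates from the observed mean cost.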
