Combining MORL with Restraining Bolts to Learn Normative Behaviour

Emery A. Neufeld; Agata Ciabattoni; Radu Florin Tulcan

IJCAI 2025

Abstract

Normative Restraining Bolts (NRBs) adapt the restraining bolt technique (originally developed for safe reinforcement learning) to ensure compliance with social, legal, and ethical norms. While effective, NRBs rely on trial-and-error weight tuning, which hinders their ability to enforce hierarchical norms; moreover, norm updates require retraining. In this paper, we reformulate learning with NRBs as a multi-objective reinforcement learning (MORL) problem, where each norm is treated as a distinct objective. This enables the introduction of Ordered Normative Restraining Bolts (ONRBs), which support algorithmic weight selection, prioritized norms, and norm updates, and which provide formal guarantees on minimizing norm violations. Case studies show that ONRBs offer a robust and principled foundation for RL agents to comply with a wide range of norms while achieving their goals.
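To make the idea of prioritized norms as weighted objectives concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a finite horizon, undiscounted returns, and at most one violation per norm per step; the function names `lexicographic_weights` and `scalarized_return` are hypothetical. Under those assumptions, weights that grow as powers of (horizon + 1) make a single violation of a higher-priority norm cost more than any possible number of lower-priority violations.

```python
from typing import List, Sequence


def lexicographic_weights(num_norms: int, horizon: int) -> List[int]:
    """Return one weight per norm, ordered from highest priority (index 0)
    to lowest. Assumes each norm is violated at most once per step, so a
    violation count never exceeds `horizon`."""
    base = horizon + 1
    return [base ** (num_norms - 1 - i) for i in range(num_norms)]


def scalarized_return(violations: Sequence[int], weights: Sequence[int]) -> int:
    """Collapse per-norm violation counts into a single (negative) return."""
    return -sum(w * v for w, v in zip(weights, violations))


# Three norms, episodes of at most 10 steps (illustrative values only).
weights = lexicographic_weights(num_norms=3, horizon=10)

# Trajectory A violates the top-priority norm once.
traj_a = [1, 0, 0]
# Trajectory B never violates the top norm but violates both others every step.
traj_b = [0, 10, 10]

# B is preferred, because avoiding the top-priority violation dominates.
prefers_b = scalarized_return(traj_b, weights) > scalarized_return(traj_a, weights)
```

With these weights, comparing scalarized returns coincides with comparing violation counts lexicographically by priority. How ONRBs actually select weights, and under which assumptions, is specified in the paper itself.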

Topics

Artificial Intelligence > Core AI > AI Safety
Reinforcement Learning > Methods > Deep RL

Keywords

AI safety, multi-objective reinforcement learning, norm compliance, restraining bolt, normative behavior
