IJCAI 2022

Multi-Constraint Deep Reinforcement Learning for Smooth Action Control

Abstract

Deep reinforcement learning (DRL) has been studied in a variety of challenging decision-making tasks, e.g., autonomous driving. However, DRL typically suffers from the action shaking problem: agents can select markedly different actions even when the states differ only slightly. One crucial cause of this problem is inappropriate reward design. To address it, we propose a novel way to incorporate action smoothness into the reward. Specifically, we introduce sub-rewards and add multiple constraints related to these sub-rewards. In addition, we propose a multi-constraint proximal policy optimization (MCPPO) method to solve the resulting multi-constraint DRL problem. Extensive simulation results show that the proposed MCPPO method achieves better action smoothness than the traditional proportional-integral-derivative (PID) controller and mainstream DRL algorithms. The video is available at https://youtu.be/F2jpaSm7YOg.
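As a rough illustration of how a smoothness term can be treated as a constraint rather than folded into a single scalar reward, the sketch below uses a generic Lagrangian relaxation. It is not the MCPPO algorithm from the paper: the abstract does not specify the sub-rewards, the constraint thresholds, or the update rule, so the function names (`smoothness_cost`, `update_multipliers`, `penalized_objective`), the mean-absolute-difference cost, and the dual-ascent step are all assumptions made for illustration.

```python
from typing import List, Sequence


def smoothness_cost(actions: Sequence[float]) -> float:
    """Mean absolute change between consecutive actions (assumed cost).

    Lower values mean smoother control; a single action has zero cost.
    """
    if len(actions) < 2:
        return 0.0
    diffs = [abs(b - a) for a, b in zip(actions, actions[1:])]
    return sum(diffs) / len(diffs)


def update_multipliers(
    multipliers: Sequence[float],
    costs: Sequence[float],
    limits: Sequence[float],
    lr: float = 0.1,
) -> List[float]:
    """Projected dual ascent, one multiplier per constraint.

    A multiplier grows while its cost exceeds its limit and is
    clipped at zero so it never rewards violating the constraint.
    """
    return [
        max(0.0, lam + lr * (cost - limit))
        for lam, cost, limit in zip(multipliers, costs, limits)
    ]


def penalized_objective(
    task_return: float,
    multipliers: Sequence[float],
    costs: Sequence[float],
    limits: Sequence[float],
) -> float:
    """Task return minus the weighted constraint violations."""
    penalty = sum(
        lam * (cost - limit)
        for lam, cost, limit in zip(multipliers, costs, limits)
    )
    return task_return - penalty
```

In a sketch like this, a policy-gradient method such as PPO would maximize `penalized_objective` while the multipliers are updated between policy updates, so that each constraint is weighted only as much as its current violation requires.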
