Know Your Enemy: Identifying Adversarial Behaviours in Deep Reinforcement Learning Agents (Student Abstract)

Seán Caulfield Curley; Karl Mason; Patrick Mannion

2023 AAAI AAAI 2023

Know Your Enemy: Identifying Adversarial Behaviours in Deep Reinforcement Learning Agents (Student Abstract)

Abstract

Abstract It has been shown that an agent can be trained with an adversarial policy which achieves high degrees of success against a state-of-the-art DRL victim despite taking unintuitive actions. This prompts the question: is this adversarial behaviour detectable through the observations of the victim alone? We find that widely used classification methods such as random forests are only able to achieve a maximum of ≈71% test set accuracy when classifying an agent for a single timestep. However, when the classifier inputs are treated as time-series data, test set classification accuracy is increased significantly to ≈98%. This is true for both classification of episodes as a whole, and for “live” classification at each timestep in an episode. These classifications can then be used to “react” to incoming attacks and increase the overall win rate against Adversarial opponents by approximately 17%. Classification of the victim’s own internal activations in response to the adversary is shown to achieve similarly impressive accuracy while also offering advantages like increased transferability to other domains.

🌉 Interdisciplinary Bridge — Machine Learning and Reinforcement Learning

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Seán Caulfield Curley , Karl Mason , Patrick Mannion

Topics

Machine Learning > Learning Types > Adversarial Learning Reinforcement Learning > Methods > Deep RL

Keywords

deep reinforcement learning adversarial learning time-series classification random forest

Download PDF

Related papers

A Model-Agnostic Heuristics for Selective Classification 2023

Tackling Safe and Efficient Multi-Agent Reinforcement Learning via Dynamic Shielding (Student Abstract) 2023

Head-Free Lightweight Semantic Segmentation with Linear Transformer 2023

Hierarchical ConViT with Attention-Based Relational Reasoner for Visual Analogical Reasoning 2023

Deep Spiking Neural Networks with High Representation Similarity Model Visual Pathways of Macaque and Mouse 2023