Topological Detection of Trojaned Neural Networks

Songzhu Zheng; Yikai Zhang; Hubert Wagner; Mayank Goswami; Chao Chen

NeurIPS 2021

Abstract

Deep neural networks are known to have security issues. One particular threat is the Trojan attack. It occurs when the attackers stealthily manipulate the model's behavior through Trojaned training samples, which can later be exploited. Guided by basic neuroscientific principles, we discover subtle -- yet critical -- structural deviation characterizing Trojaned models. In our analysis we use topological tools. They allow us to model high-order dependencies in the networks, robustly compare different networks, and localize structural abnormalities. One interesting observation is that Trojaned models develop short-cuts from shallow to deep layers. Inspired by these observations, we devise a strategy for robust detection of Trojaned models. Compared to standard baselines it displays better performance on multiple benchmarks.
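
The abstract does not say how the topological tools are applied, so the following is only an illustrative sketch under stated assumptions, not the authors' method. It assumes neurons are compared by the correlation of their activations and that the resulting distance matrix is summarized with 0-dimensional persistent homology, which tracks when groups of neurons merge as the distance threshold grows. The function names `correlation_distance` and `zero_dim_persistence` are hypothetical.

```python
import numpy as np


def correlation_distance(activations):
    """Pairwise distance 1 - |Pearson correlation| between neurons.

    activations: array of shape (n_samples, n_neurons).
    Assumption: no neuron is constant (otherwise its correlation is NaN).
    """
    corr = np.corrcoef(activations, rowvar=False)
    dist = 1.0 - np.abs(corr)
    np.fill_diagonal(dist, 0.0)
    return dist


def zero_dim_persistence(dist):
    """Death times of connected components in a Vietoris-Rips filtration.

    Every component is born at threshold 0. A component dies when an edge
    first merges it into another component, so the death times are the
    edge weights picked by Kruskal's minimum-spanning-tree algorithm.
    """
    n = dist.shape[0]
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted(
        (dist[i, j], i, j) for i in range(n) for j in range(i + 1, n)
    )
    deaths = []
    for weight, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            deaths.append(float(weight))
    return deaths
```

In this sketch, strongly correlated neurons merge at small thresholds and weakly related ones at large thresholds. Detecting the layer-crossing short-cuts mentioned in the abstract would need higher-dimensional features, which this example does not compute.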

Topics

Artificial Intelligence > Core AI > AI Safety; Artificial Intelligence > Core AI > Model Compression; Deep Learning > Optimization & Theory > Theory; Artificial Intelligence > Core AI > Safety

Keywords

model security; adversarial machine learning; topological data analysis; trojan attack detection; neural network; trojan attack; structural analysis
