Formal Guarantees on the Robustness of a Classifier against Adversarial Manipulation

Matthias Hein; Maksym Andriushchenko

NIPS 2017 (NeurIPS 2017)

Abstract

Recent work has shown that state-of-the-art classifiers are quite brittle: a small adversarial change to an input that was originally classified correctly with high confidence leads to a wrong classification, again with high confidence. This raises concerns that such classifiers are vulnerable to attacks and calls into question their use in safety-critical systems. In this paper we give, for the first time, formal guarantees on the robustness of a classifier in the form of instance-specific lower bounds on the norm of the input manipulation required to change the classifier decision. Based on this analysis we propose the Cross-Lipschitz regularization functional. We show that using this form of regularization in kernel methods and neural networks, respectively, improves the robustness of the classifier without any loss in prediction performance.
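To make the idea of an instance-specific lower bound concrete, here is a minimal sketch for the special case of a linear classifier f(x) = Wx + b. For such a model, the smallest L2 perturbation that moves the decision from the predicted class c to another class j is exactly (f_c(x) - f_j(x)) / ||w_c - w_j||_2. This is a standard fact about linear models and is not the general bound derived in the paper. The function names, the toy weights, and the 1/K^2 normalization of the penalty are assumptions made for this illustration only.

```python
import numpy as np


def linear_robustness_lower_bound(W, b, x):
    """Return (predicted class, minimal L2 perturbation norm that changes it).

    Illustration for a linear classifier f(x) = W @ x + b only; the names and
    setup are hypothetical and not taken from the paper.
    """
    scores = W @ x + b
    c = int(np.argmax(scores))
    radius = np.inf
    for j in range(W.shape[0]):
        if j == c:
            continue
        margin = scores[c] - scores[j]          # f_c(x) - f_j(x) >= 0
        grad_diff = np.linalg.norm(W[c] - W[j])  # ||grad f_c - grad f_j||_2
        if grad_diff > 0:
            radius = min(radius, margin / grad_diff)
        # grad_diff == 0 means the margin to class j never changes,
        # so class j imposes no finite constraint.
    return c, radius


def cross_lipschitz_penalty_linear(W):
    """Sum of squared differences between all pairs of class gradients.

    For a linear model the gradient of f_l is the row W[l], so the penalty
    does not depend on the input. The 1/K^2 scaling is an assumption.
    """
    K = W.shape[0]
    diffs = W[:, None, :] - W[None, :, :]  # shape (K, K, d)
    return float(np.sum(diffs ** 2) / K ** 2)


# Toy example with two classes in two dimensions.
W = np.array([[1.0, 0.0], [0.0, 1.0]])
b = np.zeros(2)
x = np.array([2.0, 0.0])

predicted, radius = linear_robustness_lower_bound(W, b, x)
print(predicted, radius, cross_lipschitz_penalty_linear(W))
```

In this sketch, shrinking the differences between class gradients increases the perturbation needed to flip a decision at a fixed margin, which is the intuition behind penalizing those differences. For nonlinear classifiers the gradients vary with the input, so a bound of this kind would have to control them over a neighborhood of x rather than at a single point.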

Topics

Artificial Intelligence > Core AI > AI Safety; Machine Learning > Core Methods > Classification; Machine Learning > Learning Types > Adversarial Learning; Artificial Intelligence > Core AI > Adversarial Learning; Artificial Intelligence > Core AI > Safety

Keywords

adversarial learning; adversarial robustness; formal verification; formal guarantee; classifier robustness; adversarial manipulation; Cross-Lipschitz regularization
