Identifying untrustworthy predictions in neural networks by geometric gradient analysis

Leo Schwinn; An Nguyen; René Raab; Leon Bungert; Daniel Tenbrinck; Dario Zanca; Martin Bürger; Bjoern Eskofier

UAI 2021

Abstract

The susceptibility of deep neural networks to untrustworthy predictions, such as those made on out-of-distribution (OOD) data and adversarial examples, still prevents their widespread use in safety-critical applications. Most existing methods either require retraining a given model to robustly identify adversarial attacks or are limited to detecting OOD samples. In this work, we propose geometric gradient analysis (GGA) to improve the identification of untrustworthy predictions without retraining a given model. GGA analyzes the geometry of a neural network's loss landscape based on the saliency maps of the respective inputs. We observe considerable differences between the input gradient geometry of trustworthy and untrustworthy predictions. Using these differences, GGA outperforms prior approaches in detecting OOD data and adversarial attacks, including state-of-the-art and adaptive attacks.
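The abstract does not spell out how the input gradient geometry is measured. As a rough illustration only, the sketch below assumes one plausible reading: compute a saliency map (the gradient of the loss with respect to the input) for every candidate class, then compare those maps pairwise by cosine similarity. The linear softmax model, the function names, and the choice of cosine similarity are all assumptions made for this example and are not taken from this page.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def input_gradient(weights, x, target):
    # Saliency map of a toy linear softmax classifier (assumed model):
    # gradient of the cross-entropy loss for class `target` w.r.t. input x.
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    probs = softmax(logits)
    # d(loss)/d(logits) = probs - one_hot(target)
    delta = [p - (1.0 if k == target else 0.0) for k, p in enumerate(probs)]
    # Chain rule through the linear layer: W^T delta
    return [sum(delta[k] * weights[k][j] for k in range(len(weights)))
            for j in range(len(x))]

def cosine(u, v):
    # Cosine similarity; returns 0.0 if either vector has zero norm.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return dot / (nu * nv)

def cosine_similarity_matrix(weights, x):
    # Pairwise cosine similarities between the per-class saliency maps.
    grads = [input_gradient(weights, x, c) for c in range(len(weights))]
    return [[cosine(gi, gj) for gj in grads] for gi in grads]

if __name__ == "__main__":
    # Hypothetical 3-class model on a 2-dimensional input.
    W = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
    x = [0.5, -0.2]
    for row in cosine_similarity_matrix(W, x):
        print(["%.3f" % value for value in row])
```

Under this assumption, a detector would take the resulting matrix (or summary statistics of it) as features for separating trustworthy from untrustworthy predictions. For a real network, the per-class gradients would come from automatic differentiation rather than the closed form used here.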

Topics

Artificial Intelligence > Core AI > Interpretability

Deep Learning > Architectures > Neural Networks

Keywords

loss landscape; saliency map; out-of-distribution detection; adversarial attack detection; geometric gradient analysis
