NeurIPS 2023
Robustness Guarantees for Adversarially Trained Neural Networks
Abstract
We study robust adversarial training of two-layer neural networks as a bi-level optimization problem. In particular, for the inner loop that implements the adversarial attack during training using projected gradient descent (PGD), we propose maximizing a \emph{lower bound} on the $0/1$-loss by reflecting a surrogate loss about the origin. This allows us to give a convergence guarantee for the inner-loop PGD attack. Furthermore, assuming the data is linearly separable, we provide precise iteration complexity results for end-to-end adversarial training, which hold for any width and initialization. We provide empirical evidence to support our theoretical results.
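As a rough illustration of the inner-loop attack described in the abstract, the sketch below runs PGD against a small two-layer network while ascending a reflected surrogate loss. It reads "reflecting a surrogate loss about the origin" as mapping $\ell(z)$ to $-\ell(-z)$, which is nonpositive and therefore lower-bounds the $0/1$-loss. The logistic surrogate, ReLU activation, $\ell_2$ perturbation ball, step size, and all function names are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def logistic_loss(z):
    # Assumed surrogate: log(1 + exp(-z)), computed stably.
    return np.logaddexp(0.0, -z)

def reflected_loss(z):
    # Reflection about the origin: (z, l(z)) -> (-z, -l(z)), i.e. -l(-z).
    # Because l >= 0, this is <= 0 and so lower-bounds the 0/1 loss 1[z <= 0].
    return -logistic_loss(-z)

def two_layer_net(x, W, a):
    # f(x) = sum_j a_j * relu(w_j . x); ReLU is an assumption of this sketch.
    return a @ np.maximum(W @ x, 0.0)

def net_input_grad(x, W, a):
    # (Sub)gradient of f with respect to the input x.
    active = (W @ x > 0).astype(float)
    return W.T @ (a * active)

def pgd_attack(x, y, W, a, eps=0.5, step=0.1, n_steps=20):
    # Gradient ascent on the reflected loss of the margin y * f(x + delta),
    # projecting delta back onto the l2 ball of radius eps after every step.
    delta = np.zeros_like(x)
    for _ in range(n_steps):
        margin = y * two_layer_net(x + delta, W, a)
        # d/dz of -log(1 + exp(z)) is -sigmoid(z); tanh form avoids overflow.
        dloss_dmargin = -0.5 * (1.0 + np.tanh(0.5 * margin))
        grad = dloss_dmargin * y * net_input_grad(x + delta, W, a)
        delta = delta + step * grad
        norm = np.linalg.norm(delta)
        if norm > eps:
            delta = delta * (eps / norm)
    return delta

# Toy usage with hypothetical weights and a single labeled point.
W = np.eye(2)
a = np.ones(2)
x = np.array([1.0, 1.0])
y = 1.0
delta = pgd_attack(x, y, W, a)
margin_before = y * two_layer_net(x, W, a)
margin_after = y * two_layer_net(x + delta, W, a)
```

In this toy setup the attack pushes the input toward the decision boundary, so `margin_after` is smaller than `margin_before` while `delta` stays inside the ball of radius `eps`. In full adversarial training, an outer loop would then update `W` on the perturbed points; that step is omitted here.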
Interdisciplinary Bridge: Deep Learning and Machine Learning
Cross-Pollinator: Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio
Topics
Machine Learning > Learning Types > Adversarial Learning
Machine Learning > Optimization & Theory > Optimization
Deep Learning > Architectures > Neural Networks
Deep Learning > Optimization & Theory > Optimization
Deep Learning > Learning Types > Adversarial Learning
Deep Learning > Learning Types > Robustness