NAT: Learning to Attack Neurons for Enhanced Adversarial Transferability

Krishna kanth Nakka; Alexandre Alahi

WACV 2025

Abstract

The generation of transferable adversarial perturbations typically involves training a generator to maximize embedding separation between clean and adversarial images at a single mid-layer of a source model. In this work, we build on this approach and introduce Neuron Attack for Transferability (NAT), a method designed to target specific neurons within the embedding. Our approach is motivated by the observation that previous layer-level optimizations often focus disproportionately on a few neurons representing similar concepts, leaving other neurons within the attacked layer minimally affected. NAT shifts the focus from embedding-level separation to a more fundamental, neuron-specific approach. We find that targeting individual neurons effectively disrupts the core units of the neural network, providing a common basis for transferability across different models. Through extensive experiments on 41 diverse ImageNet models and 9 fine-grained models, NAT achieves fooling rates that surpass existing baselines by over 14% in cross-model and 4% in cross-domain settings. Furthermore, by leveraging the complementary attacking capabilities of the trained generators, we achieve impressive fooling rates within just 10 queries. Our code is available at: https://krishnakanthnakka.github.io/NAT/
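The contrast between layer-level and neuron-level separation can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give the exact loss, so the sketch assumes that a "neuron" corresponds to one channel of a mid-layer feature map and that separation is measured as a mean squared difference. The function names `layer_separation` and `neuron_separation` and the toy feature shapes are hypothetical.

```python
import numpy as np

def layer_separation(clean_feat, adv_feat):
    """Mean squared difference over every channel of the layer (assumed loss form)."""
    return float(np.mean((clean_feat - adv_feat) ** 2))

def neuron_separation(clean_feat, adv_feat, neuron):
    """Mean squared difference restricted to a single channel (assumed 'neuron')."""
    diff = clean_feat[:, neuron] - adv_feat[:, neuron]
    return float(np.mean(diff ** 2))

# Toy mid-layer features with shape (batch, channels, height, width).
rng = np.random.default_rng(0)
clean = rng.normal(size=(2, 4, 3, 3))

# Simulate an attack that only moves channel 1 and leaves the others untouched.
adv = clean.copy()
adv[:, 1] += 1.0

layer_score = layer_separation(clean, adv)
per_neuron_scores = [neuron_separation(clean, adv, k) for k in range(clean.shape[1])]

print("layer-level separation:", layer_score)
print("per-neuron separation:", per_neuron_scores)
```

In this toy case the layer-level score is nonzero even though three of the four channels are unchanged, which mirrors the observation that a layer-level objective can be satisfied by disrupting only a few neurons. Under the same assumptions, a neuron-specific objective would maximize `neuron_separation` for one chosen index per generator.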

Topics

Machine Learning > Core Methods > Embedding Learning

Machine Learning > Learning Types > Adversarial Learning

Deep Learning > Architectures > Neural Networks

Keywords

adversarial perturbation, adversarial transferability, fooling rate, neural network, embedding separation
