Adversarial Attention Deficit: Fooling Deformable Vision Transformers with Collaborative Adversarial Patches

Quazi Mishkatul Alam; Bilel Tarchoun; Ihsen Alouani; Nael Abu-Ghazaleh

WACV 2025

Abstract

Deformable vision transformers reduce the expensive quadratic time complexity of attention modeling by using sparse attention structures, making it possible to use transformers in large-scale vision applications such as multi-view vision systems. We show that existing adversarial attacks against conventional vision transformers do not transfer to deformable transformers, primarily because of the data-dependent, dynamic nature of sparse attention. In this work, we present the first adversarial attacks against deformable vision transformers, which work by taking control of their attention-inferring module. We develop a novel collaborative attack in which a source patch manipulates attention to point to a target patch containing the adversarial noise that fools the model. We observe that our attack alters less than 1% of the patched area in the input field, yet completely disrupts object detection, resulting in 0% AP in single-view object detection on MS COCO and 0% MODA in multi-view object detection on Wildtrack.
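
To make the complexity claim concrete, the following minimal sketch counts query-key interactions for dense attention and for a sparse scheme in which each query attends to a fixed number of sampled locations. The function names, the feature-map size, and the choice of 4 sampled points per query are illustrative assumptions, not values taken from the paper.

```python
def dense_attention_pairs(num_tokens: int) -> int:
    """Every token attends to every token: quadratic in the token count."""
    return num_tokens * num_tokens


def sparse_attention_pairs(num_tokens: int, points_per_query: int) -> int:
    """Each token attends to a fixed number of sampled locations: linear."""
    return num_tokens * points_per_query


if __name__ == "__main__":
    # Hypothetical 100 x 100 feature map, flattened into tokens.
    tokens = 100 * 100
    k = 4  # assumed number of sampled points per query (illustrative only)
    print(dense_attention_pairs(tokens))      # 100000000
    print(sparse_attention_pairs(tokens, k))  # 40000
```

In a deformable transformer the sampled locations are inferred from the input itself, which is the data-dependent behavior the abstract identifies as the reason existing attacks fail to transfer.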

Topics

Machine Learning > Learning Types > Adversarial Learning; Computer Vision > Analysis > Object Detection; Artificial Intelligence > Core AI > Computer Vision; Deep Learning > Learning Types > Adversarial Learning; Computer Vision > Core AI > Computer Vision

Keywords

vision transformer; object detection; attention mechanism; adversarial attack; deformable transformer; adversarial patch; deformable vision transformer
