2025 ICCV ICCV 2025

Prototype Guided Backdoor Defense via Activation Space Manipulation

Abstract

Deep learning models are susceptible to backdoor attacks involving malicious perturbation of some training data with a trigger to force misclassification to a target class. Various triggers have been used including semantic triggers that are easily realizable. We present Prototype Guided Backdoor Defense (PGBD), a robust post-hoc defense that scales across different trigger types, including previously unsolved semantic triggers. PPGBD exploits displacements in the geometric spaces of activations to penalize movements towards the trigger. This is done using a novel sanitization loss of a post-hoc fine-tuning step. This approach scales to all types of attacks and triggers, and achieves better performance across settings. We also present the first defense against semantic attacks on a new celebrity face images dataset. Activation spaces can provide rich clues to enhance DL models in different ways.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Deep Learning and Machine Learning
🧭 Keyword Pioneer — semantic trigger
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio