SEER: Backdoor Detection for Vision-Language Models through Searching Target Text and Image Trigger Jointly

Liuwan Zhu; Rui Ning; Jiang Li; Chunsheng Xin; Hongyi Wu

AAAI 2024

Abstract

This paper proposes SEER, a novel backdoor detection algorithm for vision-language models that addresses a gap in the literature on multi-modal backdoor detection. Backdoor detection in single-modal models has been well studied, but such defenses remain largely unexplored for multi-modal models. Existing backdoor defense mechanisms cannot be applied directly to multi-modal settings because of the added complexity and the explosion of the search space. We propose to detect backdoors in vision-language models by jointly searching for image triggers and malicious target texts in the feature space shared by the vision and language modalities. Our extensive experiments demonstrate that SEER achieves a detection rate above 92% on vision-language models in various settings, without access to training data or knowledge of downstream tasks.
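The idea of searching for a trigger and a target text together in a shared feature space can be illustrated with a toy example. The sketch below is not the authors' implementation: the linear "image encoder" `W`, the three text embeddings, the perturbation budget `eps`, and the helper names are all hypothetical stand-ins chosen for illustration. For each candidate text it runs projected gradient ascent on a small additive trigger to maximize the mean cosine similarity between triggered image features and the text embedding, then reports the text that a small trigger reaches most easily.

```python
import math

# Hypothetical toy model: a linear "image encoder" and three text embeddings.
# Input pixel 3 is a planted shortcut that pushes features toward "banana".
W = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 20.0]]
TEXTS = {"cat": [1.0, 0.0, 0.0], "dog": [0.0, 1.0, 0.0], "banana": [0.0, 0.0, 1.0]}
IMAGES = [[1.0, 0.2, 0.1, 0.0], [0.1, 1.0, 0.2, 0.0], [0.5, 0.5, 0.1, 0.0]]

def norm(v):
    return math.sqrt(sum(a * a for a in v)) + 1e-12

def encode(x):
    # Image features in the shared space: u = W x
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def cosine(u, t):
    return sum(a * b for a, b in zip(u, t)) / (norm(u) * norm(t))

def score(delta, t):
    # Mean similarity between triggered images and the candidate text
    sims = [cosine(encode([a + b for a, b in zip(x, delta)]), t) for x in IMAGES]
    return sum(sims) / len(sims)

def grad(delta, t):
    # Gradient of the mean cosine similarity with respect to the trigger
    g = [0.0] * len(delta)
    for x in IMAGES:
        u = encode([a + b for a, b in zip(x, delta)])
        nu, nt, c = norm(u), norm(t), cosine(u, t)
        gu = [tb / (nu * nt) - c * ua / (nu * nu) for ua, tb in zip(u, t)]
        for j in range(len(delta)):
            g[j] += sum(W[i][j] * gu[i] for i in range(len(gu))) / len(IMAGES)
    return g

def search_trigger(t, eps=0.2, lr=0.05, steps=50):
    # Projected gradient ascent: keep every trigger entry within [-eps, eps]
    delta = [0.0] * len(IMAGES[0])
    for _ in range(steps):
        g = grad(delta, t)
        delta = [max(-eps, min(eps, d + lr * gj)) for d, gj in zip(delta, g)]
    return delta, score(delta, t)

# Search over (text, trigger) pairs and keep the best score per text
scores = {name: search_trigger(t)[1] for name, t in TEXTS.items()}
suspect = max(scores, key=scores.get)
print(suspect, scores)
```

In this toy setup the planted text is the only one a small trigger can pull every image toward, so it receives the highest score. A real detector would replace the linear map with the model's actual encoders and would need a principled rule for deciding when a score is anomalous.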

Authors

Liuwan Zhu, Rui Ning, Jiang Li, Chunsheng Xin, Hongyi Wu

Topics

Artificial Intelligence > Core AI > AI Safety
Artificial Intelligence > Core AI > Model Compression
Artificial Intelligence > Core AI > Adversarial Learning
Deep Learning > Learning Types > Adversarial Learning
Artificial Intelligence > Core AI > Multi-Modal Learning

Keywords

multi-modal learning, feature space, vision-language model, backdoor detection, trigger detection, multi-modal defense
