2026 AAAI AAAI 2026

TarPro: Targeted Protection Against Malicious Image Editing

Abstract

Abstract The rapid advancement of diffusion-based image editing has enabled highly controllable visual content generation but has also raised serious concerns about the misuse of generative models for producing Not-Safe-for-Work (NSFW) content. Existing protection strategies inject adversarial perturbations to disrupt editing. However, these methods are untargeted, often degrading benign edits while failing to eliminate harmful outputs. In this work, we propose TarPro, a targeted protection framework that blocks malicious edits while preserving benign editing functionality. TarPro introduces Dual-Intent Optimization (DIO), a semantic alignment objective that suppresses malicious prompt effects while retaining desirable, benign edits, by leveraging prompt compositionality rather than requiring manually annotated preferences. To ensure robustness and generalization, we replace pixel-level optimization with a generator-based perturbation learning strategy that learns to produce structured, imperceptible perturbations in parameter space. Experiments on multiple diffusion backbones show that TarPro significantly blocks NSFW content while maintaining high-quality benign edits, outperforming strong baselines through both qualitative and quantitative evaluations.

🌉 Interdisciplinary Bridge — Computer Vision and Deep Learning
🧭 Keyword Pioneer — image editing protection
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio