Adversarial Learning

1235 directly classified papers

Papers

When Visual State Space Model Meets Backdoor Attacks WACV 2025

Edit Away and My Face Will not Stay: Personal Biometric Defense against Malicious Generative Editing CVPR 2025

Data-free Universal Adversarial Perturbation with Pseudo-semantic Prior CVPR 2025

Power of Diversity: Enhancing Data-Free Black-Box Attack with Domain-Augmented Learning AAAI 2025

Silence is Golden: Leveraging Adversarial Examples to Nullify Audio Control in LDM-based Talking-Head Generation CVPR 2025

Divide and Conquer: Heterogeneous Noise Integration for Diffusion-based Adversarial Purification CVPR 2025

Mind the Trojan Horse: Image Prompt Adapter Enabling Scalable and Deceptive Jailbreaking CVPR 2025

Mind the Gap: Detecting Black-box Adversarial Attacks in the Making through Query Update Analysis CVPR 2025

RAEncoder: A Label-Free Reversible Adversarial Examples Encoder for Dataset Intellectual Property Protection CVPR 2025

Why Safeguarded Ships Run Aground? Aligned Large Language Models’ Safety Mechanisms Tend to Be Anchored in The Template Region ACL 2025

Sheep’s Skin, Wolf’s Deeds: Are LLMs Ready for Metaphorical Implicit Hate Speech? ACL 2025

Defense Against Prompt Injection Attack by Leveraging Attack Techniques ACL 2025

LLMs know their vulnerabilities: Uncover Safety Gaps through Natural Distribution Shifts ACL 2025

Can LLMs Deceive CLIP? Benchmarking Adversarial Compositionality of Pre-trained Multimodal Representation via Text Updates ACL 2025

Guardrails and Security for LLMs: Safe, Secure and Controllable Steering of LLM Applications ACL 2025

Efficient but Vulnerable: Benchmarking and Defending LLM Batch Prompting Attack ACL 2025

RedHit: Adaptive Red-Teaming of Large Language Models via Search, Reasoning, and Preference Optimization ACL 2025

Using Humor to Bypass Safety Guardrails in Large Language Models ACL 2025

Bypassing LLM Guardrails: An Empirical Analysis of Evasion Attacks against Prompt Injection and Jailbreak Detection Systems ACL 2025

Safe in Isolation, Dangerous Together: Agent-Driven Multi-Turn Decomposition Jailbreaks on LLMs ACL 2025

Graph of Attacks with Pruning: Optimizing Stealthy Jailbreak Prompt Generation for Enhanced LLM Content Moderation ACL 2025

Improving Transferable Targeted Attacks with Feature Tuning Mixup CVPR 2025

SUA: Stealthy Multimodal Large Language Model Unlearning Attack EMNLP 2025

Rethinking the Adversarial Robustness of Multi-Exit Neural Networks in an Attack-Defense Game CVPR 2025

From Allies to Adversaries: Manipulating LLM Tool-Calling through Adversarial Injection NAACL 2025