Adversarial Learning

1235 directly classified papers

Papers

SilverSpeak: Evading AI-Generated Text Detectors using Homoglyphs COLING 2025

Integrating Argumentation Features for Enhanced Propaganda Detection in Arabic Narratives on the Israeli War on Gaza COLING 2025

DAMAGE: Detecting Adversarially Modified AI Generated Text COLING 2025

Mitigating Catastrophic Overfitting in Fast Adversarial Training via Label Information Elimination ICCV 2025

Prompt-driven Transferable Adversarial Attack on Person Re-Identification with Attribute-aware Textual Inversion ICCV 2025

IDEATOR: Jailbreaking and Benchmarking Large Vision-Language Models Using Themselves ICCV 2025

Backdoor Attacks on Neural Networks via One-Bit Flip ICCV 2025

Temporal Unlearnable Examples: Preventing Personal Video Data from Unauthorized Exploitation by Object Tracking ICCV 2025

Power of Diversity: Enhancing Data-Free Black-Box Attack with Domain-Augmented Learning AAAI 2025

From Allies to Adversaries: Manipulating LLM Tool-Calling through Adversarial Injection NAACL 2025

AutoBreach: Universal and Adaptive Jailbreaking with Efficient Wordplay-Guided Optimization via Multi-LLMs NAACL 2025

SMP-Attack: Boosting the Transferability of Feature Importance-based Adversarial Attack with Semantics-aware Multi-granularity Patchout ICCV 2025

Stealthy Jailbreak Attacks on Large Language Models via Benign Data Mirroring NAACL 2025

ODDR: Outlier Detection & Dimension Reduction Based Defense Against Adversarial Patches ICCV 2025

SeqAR: Jailbreak LLMs with Sequential Auto-Generated Characters NAACL 2025

Boosting Generative Adversarial Transferability with Self-supervised Vision Transformer Features ICCV 2025

Atoxia: Red-teaming Large Language Models with Target Toxic Answers NAACL 2025

Vulnerability of Large Language Models to Output Prefix Jailbreaks: Impact of Positions on Safety NAACL 2025

ZIUM: Zero-Shot Intent-Aware Adversarial Attack on Unlearned Models ICCV 2025

Boosting Adversarial Transferability via Negative Hessian Trace Regularization ICCV 2025

Why Safeguarded Ships Run Aground? Aligned Large Language Models’ Safety Mechanisms Tend to Be Anchored in The Template Region ACL 2025

Enhancing Adversarial Transferability with Adversarial Weight Tuning AAAI 2025

Data-free Universal Adversarial Perturbation with Pseudo-semantic Prior CVPR 2025

Edit Away and My Face Will not Stay: Personal Biometric Defense against Malicious Generative Editing CVPR 2025

Efficient Image-to-Image Diffusion Classifier for Adversarial Robustness AAAI 2025