Research Explorer
Adversarial Learning
1235 directly classified papers
Papers per year
2009: 1
2010: 1
2011: 1
2013: 1
2014: 1
2016: 1
2017: 7
2018: 35
2019: 86
2020: 130
2021: 166
2022: 188
2023: 166
2024: 185
2025: 264
2026: 2
Papers
Unnoticed Yet Effective: A Hybrid Physical Camouflage Framework Against DNNs and Human Perception
AAAI 2026
Mitigating Backdoor Attacks via Trigger Reconstruction and Model Hardening
WACV 2026
SUA: Stealthy Multimodal Large Language Model Unlearning Attack
EMNLP 2025
TombRaider: Entering the Vault of History to Jailbreak Large Language Models
EMNLP 2025
RedHit: Adaptive Red-Teaming of Large Language Models via Search, Reasoning, and Preference Optimization
ACL 2025
Visual Contextual Attack: Jailbreaking MLLMs with Image-Driven Context Injection
EMNLP 2025
Safe in Isolation, Dangerous Together: Agent-Driven Multi-Turn Decomposition Jailbreaks on LLMs
ACL 2025
Evading Toxicity Detection with ASCII-art: A Benchmark of Spatial Attacks on Moderation Systems
ACL 2025
LLMs know their vulnerabilities: Uncover Safety Gaps through Natural Distribution Shifts
ACL 2025
Defense Against Prompt Injection Attack by Leveraging Attack Techniques
ACL 2025
Efficient but Vulnerable: Benchmarking and Defending LLM Batch Prompting Attack
ACL 2025
Attention Eclipse: Manipulating Attention to Bypass LLM Safety-Alignment
EMNLP 2025
Your Language Model Can Secretly Write Like Humans: Contrastive Paraphrase Attacks on LLM-Generated Text Detectors
EMNLP 2025
NullSwap: Proactive Identity Cloaking Against Deepfake Face Swapping
ICCV 2025
FaceShield: Defending Facial Image against Deepfake Threats
ICCV 2025
Can LLMs Deceive CLIP? Benchmarking Adversarial Compositionality of Pre-trained Multimodal Representation via Text Updates
ACL 2025
Using Humor to Bypass Safety Guardrails in Large Language Models
ACL 2025
Bypassing LLM Guardrails: An Empirical Analysis of Evasion Attacks against Prompt Injection and Jailbreak Detection Systems
ACL 2025
Power of Diversity: Enhancing Data-Free Black-Box Attack with Domain-Augmented Learning
AAAI 2025
Why Safeguarded Ships Run Aground? Aligned Large Language Models’ Safety Mechanisms Tend to Be Anchored in The Template Region
ACL 2025
Cross-Modal Stealth: A Coarse-to-Fine Attack Framework for RGB-T Tracker
AAAI 2025
Sheep’s Skin, Wolf’s Deeds: Are LLMs Ready for Metaphorical Implicit Hate Speech?
ACL 2025
Bridging Robustness and Generalization Against Word Substitution Attacks in NLP via the Growth Bound Matrix Approach
ACL 2025
Guardrails and Security for LLMs: Safe, Secure and Controllable Steering of LLM Applications
ACL 2025
Graph of Attacks with Pruning: Optimizing Stealthy Jailbreak Prompt Generation for Enhanced LLM Content Moderation
ACL 2025