Research Explorer
AI Safety
2972 directly classified papers
Papers per year
2002: 1
2006: 1
2007: 1
2012: 4
2013: 1
2015: 5
2016: 1
2017: 13
2018: 40
2019: 91
2020: 111
2021: 181
2022: 204
2023: 333
2024: 642
2025: 1031
2026: 312
Papers
MTAttack: Multi-Target Backdoor Attacks Against Large Vision-Language Models
AAAI 2026
FRBAT: Conditionally-Visible Physical Backdoor Attack via Fluorescence
AAAI 2026
Advancing Out-of-Distribution Detection Across Diverse Scenarios
AAAI 2026
Creating Blank Canvas Against AI-enabled Image Forgery
AAAI 2026
Certified but Fooled! Breaking Certified Defenses with Ghost Certificates
AAAI 2026
On Trustworthy, Explainable, and Verifiable High-Level Autonomy via Hierarchical Planning
AAAI 2026
Persistent Instability in LLM’s Personality Measurements: Effects of Scale, Reasoning, and Conversation History
AAAI 2026
Characterizing AI Manipulation Risks in Brazilian YouTube Climate Discourse
AAAI 2026
Evaluating LLMs for Police Decision-Making: A Framework Based on Police Action Scenarios
AAAI 2026
Can Editing LLMs Inject Harm?
AAAI 2026
The Emotional Baby Is Truly Deadly: Does Your Multimodal Large Reasoning Model Have Emotional Flattery Towards Humans?
AAAI 2026
PhysPatch: A Physically Realizable and Transferable Adversarial Patch Attack for Multimodal Large Language Models-based Autonomous Driving Systems
AAAI 2026
Diversifying Counterattacks: Orthogonal Exploration for Robust CLIP Inference
AAAI 2026
Selective Weak-to-Strong Generalization
AAAI 2026
VILTA: A VLM-in-the-Loop Adversary for Enhancing Driving Policy Robustness
AAAI 2026
Clean-Label Physical Backdoor Attacks with Data Distillation
AAAI 2026
EigenShield: Inference-Time, Model-Agnostic Jailbreaking Defense via Causal Subspace Filtering
AAAI 2026
Disentangling Adversarial Prompts: A Semantic-Graph Defense for Robust LLM Security
AAAI 2026
Where to Start Alignment? Diffusion Large Language Model May Demand a Distinct Position
AAAI 2026
An LLM-based Quantitative Framework for Evaluating High-Stealthy Backdoor Risks in OSS Supply Chains
AAAI 2026
Authority Backdoor: A Certifiable Backdoor Mechanism for Authoring DNNs
AAAI 2026
HogVul: Black-box Adversarial Code Generation Framework Against LM-based Vulnerability Detectors
AAAI 2026
Differentiated Directional Intervention: A Framework for Evading LLM Safety Alignment
AAAI 2026
MovieGraph-ToM: Evaluating Long-Range Theory of Mind in Large Language Models via Implicit Social-Causal Graphs
AAAI 2026
Modulation-Based Backdoors: Leveraging Amplitude and Frequency Patterns to Attack Speaker Recognition
AAAI 2026