Research Explorer
AI Safety
2972 directly classified papers
Papers per year
2002: 1
2006: 1
2007: 1
2012: 4
2013: 1
2015: 5
2016: 1
2017: 13
2018: 40
2019: 91
2020: 111
2021: 181
2022: 204
2023: 333
2024: 642
2025: 1031
2026: 312
Papers
Pushing the Right Buttons: Adversarial Evaluation of Quality Estimation
EMNLP 2021
Swiss-Judgment-Prediction: A Multilingual Legal Judgment Prediction Benchmark
EMNLP 2021
Investigating Robustness of Dialog Models to Popular Figurative Language Constructs
EMNLP 2021
Learning with Safety Constraints: Sample Complexity of Reinforcement Learning for Constrained MDPs
AAAI 2021
Transfer Learning for Efficient Iterative Safety Validation
AAAI 2021
Sample-Specific Output Constraints for Neural Networks
AAAI 2021
SMT-based Safety Checking of Parameterized Multi-Agent Systems
AAAI 2021
Initiative Defense against Facial Manipulation
AAAI 2021
Probabilistic Robustness Quantification of Neural Networks
AAAI 2021
Safety Assurance for Systems with Machine Learning Components
AAAI 2021
Verification and Repair of Neural Networks
AAAI 2021
Adversarial Training and Provable Robustness: A Tale of Two Objectives
AAAI 2021
Fast Training of Provably Robust Neural Networks by SingleProp
AAAI 2021
Deep Feature Space Trojan Attack of Neural Networks by Controlled Detoxification
AAAI 2021
Model-Targeted Poisoning Attacks with Provable Convergence
ICML 2021
Gaussian Process-Based Real-Time Learning for Safety Critical Applications
ICML 2021
Query Complexity of Adversarial Attacks
ICML 2021
Accelerating Safe Reinforcement Learning with Constraint-mismatched Baseline Policies
ICML 2021
Safe Reinforcement Learning Using Advantage-Based Intervention
ICML 2021
Backdoor Scanning for Deep Neural Networks through K-Arm Optimization
ICML 2021
Multi-Expert Adversarial Attack Detection in Person Re-Identification Using Context Inconsistency
ICCV 2021
Aha! Adaptive History-Driven Attack for Decision-Based Black-Box Models
ICCV 2021
HIRE-SNN: Harnessing the Inherent Robustness of Energy-Efficient Deep Spiking Neural Networks by Training With Crafted Input Noise
ICCV 2021
CLEAR: Clean-Up Sample-Targeted Backdoor in Neural Networks
ICCV 2021
Defending Against Universal Adversarial Patches by Clipping Feature Norms
ICCV 2021