AI Safety
2972 directly classified papers
Papers per year
2002: 1
2006: 1
2007: 1
2012: 4
2013: 1
2015: 5
2016: 1
2017: 13
2018: 40
2019: 91
2020: 111
2021: 181
2022: 204
2023: 333
2024: 642
2025: 1031
2026: 312
Papers
Learn2Perturb: An End-to-End Feature Perturbation Learning to Improve Adversarial Robustness
CVPR 2020
Algorithmic recourse under imperfect causal knowledge: a probabilistic approach
NIPS 2020
Recovery of sparse linear classifiers from mixture of responses
NIPS 2020
Consistency Regularization for Certified Robustness of Smoothed Classifiers
NIPS 2020
Certifying Confidence via Randomized Smoothing
NIPS 2020
Certifiably Adversarially Robust Detection of Out-of-Distribution Data
NIPS 2020
Robustness of Bayesian Neural Networks to Gradient-Based Attacks
NIPS 2020
MetaPoison: Practical General-purpose Clean-label Data Poisoning
NIPS 2020
On the Trade-off between Adversarial and Backdoor Robustness
NIPS 2020
GreedyFool: Distortion-Aware Sparse Adversarial Attack
NIPS 2020
GNNGuard: Defending Graph Neural Networks against Adversarial Attacks
NIPS 2020
On Adaptive Attacks to Adversarial Example Defenses
NIPS 2020
Automatic Perturbation Analysis for Scalable Certified Robustness and Beyond
NIPS 2020
Backpropagating Linearly Improves Transferability of Adversarial Examples
NIPS 2020
Uncertainty-Aware Constraint Learning for Adaptive Safe Motion Planning from Demonstrations
CORL 2020
Neurosymbolic Reinforcement Learning with Formally Verified Exploration
NIPS 2020
Efficient Exploration of Reward Functions in Inverse Reinforcement Learning via Bayesian Optimization
NIPS 2020
Higher-Order Certification For Randomized Smoothing
NIPS 2020
Learning Black-Box Attackers with Transferable Priors and Query Feedback
NIPS 2020
Give Me Convenience and Give Her Death: Who Should Decide What Uses of NLP are Appropriate, and on What Basis?
ACL 2020
Adversarial Weight Perturbation Helps Robust Generalization
NIPS 2020
BullStop: A Mobile App for Cyberbullying Prevention
COLING 2020
Avoiding Side Effects By Considering Future Tasks
NIPS 2020
Consequences of Misaligned AI
NIPS 2020
A Gamified Assessment Platform for Predicting the Risk of Dementia +Parkinson’s disease (DPD) Co-Morbidity
IJCAI 2020