AI Safety

2972 directly classified papers

Papers

Pushing the Right Buttons: Adversarial Evaluation of Quality Estimation EMNLP 2021

Swiss-Judgment-Prediction: A Multilingual Legal Judgment Prediction Benchmark EMNLP 2021

Investigating Robustness of Dialog Models to Popular Figurative Language Constructs EMNLP 2021

Learning with Safety Constraints: Sample Complexity of Reinforcement Learning for Constrained MDPs AAAI 2021

Transfer Learning for Efficient Iterative Safety Validation AAAI 2021

Sample-Specific Output Constraints for Neural Networks AAAI 2021

SMT-based Safety Checking of Parameterized Multi-Agent Systems AAAI 2021

Initiative Defense against Facial Manipulation AAAI 2021

Probabilistic Robustness Quantification of Neural Networks AAAI 2021

Safety Assurance for Systems with Machine Learning Components AAAI 2021

Verification and Repair of Neural Networks AAAI 2021

Adversarial Training and Provable Robustness: A Tale of Two Objectives AAAI 2021

Fast Training of Provably Robust Neural Networks by SingleProp AAAI 2021

Deep Feature Space Trojan Attack of Neural Networks by Controlled Detoxification AAAI 2021

Model-Targeted Poisoning Attacks with Provable Convergence ICML 2021

Gaussian Process-Based Real-Time Learning for Safety Critical Applications ICML 2021

Query Complexity of Adversarial Attacks ICML 2021

Accelerating Safe Reinforcement Learning with Constraint-mismatched Baseline Policies ICML 2021

Safe Reinforcement Learning Using Advantage-Based Intervention ICML 2021

Backdoor Scanning for Deep Neural Networks through K-Arm Optimization ICML 2021

Multi-Expert Adversarial Attack Detection in Person Re-Identification Using Context Inconsistency ICCV 2021

Aha! Adaptive History-Driven Attack for Decision-Based Black-Box Models ICCV 2021

HIRE-SNN: Harnessing the Inherent Robustness of Energy-Efficient Deep Spiking Neural Networks by Training With Crafted Input Noise ICCV 2021

CLEAR: Clean-Up Sample-Targeted Backdoor in Neural Networks ICCV 2021

Defending Against Universal Adversarial Patches by Clipping Feature Norms ICCV 2021