Research Explorer
Responsible AI
1,991 directly classified papers
Papers per year
2011: 1
2016: 1
2017: 7
2018: 10
2019: 22
2020: 51
2021: 91
2022: 145
2023: 207
2024: 526
2025: 760
2026: 170
Papers
Private Set Generation with Discriminative Information
NIPS 2022
A Major Obstacle for NLP Research: Let’s Talk about Time Allocation!
EMNLP 2022
Debiasing Masks: A New Framework for Shortcut Mitigation in NLU
EMNLP 2022
Second Thoughts are Best: Learning to Re-Align With Human Values from Text Edits
NIPS 2022
QUARK: Controllable Text Generation with Reinforced Unlearning
NIPS 2022
Training language models to follow instructions with human feedback
NIPS 2022
ProsocialDialog: A Prosocial Backbone for Conversational Agents
EMNLP 2022
Delivering Trustworthy AI through Formal XAI
AAAI 2022
SkinCon: A skin disease dataset densely annotated by domain experts for fine-grained debugging and analysis
NIPS 2022
SafeText: A Benchmark for Exploring Physical Safety in Language Models
EMNLP 2022
Characteristics of Harmful Text: Towards Rigorous Benchmarking of Language Models
NIPS 2022
Pile of Law: Learning Responsible Data Filtering from the Law and a 256GB Open-Source Legal Dataset
NIPS 2022
Rich Knowledge Sources Bring Complex Knowledge Conflicts: Recalibrating Models to Reflect Conflicting Evidence
EMNLP 2022
Gendered Mental Health Stigma in Masked Language Models
EMNLP 2022
Preparing High School Teachers to Integrate AI Methods into STEM Classrooms
AAAI 2022
DeepAuth: A DNN Authentication Framework by Model-Unique and Fragile Signature Embedding
AAAI 2022
ExSum: From Local Explanations to Model Understanding
NAACL 2022
Measure and Improve Robustness in NLP Models: A Survey
NAACL 2022
How Gender Debiasing Affects Internal Model Representations, and Why It Matters
NAACL 2022
Theory-Grounded Measurement of U.S. Social Stereotypes in English Language Models
NAACL 2022
Cross-Domain Detection of GPT-2-Generated Technical Text
NAACL 2022
Debiasing Pre-Trained Language Models via Efficient Fine-Tuning
ACL 2022
Measuring Harmful Sentence Completion in Language Models for LGBTQIA+ Individuals
ACL 2022
Pipelines for Social Bias Testing of Large Language Models
ACL 2022
Towards Automating Model Explanations with Certified Robustness Guarantees
AAAI 2022