Research Explorer
Responsible AI
1991 directly classified papers
Papers per year
2011: 1
2016: 1
2017: 7
2018: 10
2019: 22
2020: 51
2021: 91
2022: 145
2023: 207
2024: 526
2025: 760
2026: 170
Papers
Artificial Impressions: Evaluating Large Language Model Behavior Through the Lens of Trait Impressions
EMNLP 2025
Path Drift in Large Reasoning Models: How First-Person Commitments Override Safety
EMNLP 2025
CritiQ: Mining Data Quality Criteria from Human Preferences
ACL 2025
FairI Tales: Evaluation of Fairness in Indian Contexts with a Focus on Bias and Stereotypes
ACL 2025
MolErr2Fix: Benchmarking LLM Trustworthiness in Chemistry via Modular Error Detection, Localization, Explanation, and Correction
EMNLP 2025
HumT DumT: Measuring and controlling human-like language in LLMs
ACL 2025
Multilingual Text-to-Image Generation Magnifies Gender Stereotypes
ACL 2025
LLMs as Medical Safety Judges: Evaluating Alignment with Human Annotation in Patient-Facing QA
ACL 2025
Shared Path: Unraveling Memorization in Multilingual LLMs through Language Similarities
EMNLP 2025
Who Holds the Pen? Caricature and Perspective in LLM Retellings of History
EMNLP 2025
K/DA: Automated Data Generation Pipeline for Detoxifying Implicitly Offensive Language in Korean
ACL 2025
Unintended Harms of Value-Aligned LLMs: Psychological and Empirical Insights
ACL 2025
From Surveys to Narratives: Rethinking Cultural Value Adaptation in LLMs
EMNLP 2025
Only a Little to the Left: A Theory-grounded Measure of Political Bias in Large Language Models
ACL 2025
SDD: Self-Degraded Defense against Malicious Fine-tuning
ACL 2025
Surface Fairness, Deep Bias: A Comparative Study of Bias in Language Models
ACL 2025
Iterative Prompt Refinement for Safer Text-to-Image Generation
EMNLP 2025
Who Writes What: Unveiling the Impact of Author Roles on AI-generated Text Detection
ACL 2025
Dehumanizing Machines: Mitigating Anthropomorphic Behaviors in Text Generation Systems
ACL 2025
Adapting Psycholinguistic Research for LLMs: Gender-inclusive Language in a Coreference Context
ACL 2025
Safety Alignment via Constrained Knowledge Unlearning
ACL 2025
SEA: Low-Resource Safety Alignment for Multimodal Large Language Models via Synthetic Embeddings
ACL 2025
From Imitation to Innovation: The Emergence of AI's Unique Artistic Styles and the Challenge of Copyright Protection
ICCV 2025
Less Is More? Examining Fairness in Pruned Large Language Models for Summarising Opinions
EMNLP 2025
STEER-BENCH: A Benchmark for Evaluating the Steerability of Large Language Models
EMNLP 2025