Research Explorer
Responsible AI
1991 directly classified papers
Papers per year
2011: 1
2016: 1
2017: 7
2018: 10
2019: 22
2020: 51
2021: 91
2022: 145
2023: 207
2024: 526
2025: 760
2026: 170
Papers
ARGH-Mark: Anchor-Synchronized Watermarking with Hamming Correction for Robust and Quality-Preserving LLM Attribution
AAAI 2026
Editing as Unlearning: Are Knowledge Editing Methods Strong Baselines for Large Language Model Unlearning?
AAAI 2026
How Much Do Large Language Model Cheat on Evaluation? Benchmarking Overestimation Under the One-Time-Pad-Based Framework
AAAI 2026
On the Alignment of Large Language Models with Global Human Opinion
AAAI 2026
DarkBench+: An Extended Benchmark for Evaluating Dark Patterns in Large Language Models
AAAI 2026
Towards Benchmarking Privacy Vulnerabilities in Selective Forgetting with Large Language Models
AAAI 2026
Detecting Compute Structuring in AI Governance Is Likely Feasible
AAAI 2026
Designing Incident Reporting Systems for Harms from General-Purpose AI
AAAI 2026
Fine-Grained Interpretation of Political Opinions in Large Language Models
AAAI 2026
The Confidence Trap: Gender Bias and Predictive Certainty in LLMs
AAAI 2026
Robust Learning from Noisily Labeled Long-Tailed Data via Fairness Regularizer
AAAI 2026
iSeal: Encrypted Fingerprinting for Reliable LLM Ownership Verification
AAAI 2026
SafeR-CLIP: Mitigating NSFW Content in Vision-Language Models While Preserving Pre-Trained Knowledge
AAAI 2026
T2I-RiskyPrompt: A Benchmark for Safety Evaluation, Attack, and Defense on Text-to-Image Model
AAAI 2026
AURA: Affordance-Understanding and Risk-aware Alignment Technique for Large Language Models
AAAI 2026
ACID Test: A Benchmark for Cultural Safety and Alignment in LALMs
AAAI 2026
Identifying Features Associated with Bias Against 93 Stigmatized Groups in Language Models and Guardrail Model Safety Mitigation
AAAI 2026
Silenced Biases: The Dark Side LLMs Learned to Refuse
AAAI 2026
Reducing the Scope of Language Models
AAAI 2026
Steering Representations, Safeguarding Privacy: A Cross-Modal Privacy Protection Method for Generative AI
AAAI 2026
ShadeEdit: A Utility-Preserving and Defense-Evasive Knowledge Manipulation Attack in Federated LLMs
AAAI 2026
SCOPE: Intrinsic Semantic Space Control for Mitigating Copyright Infringement in LLMs
AAAI 2026
ALTER: Asymmetric LoRA for Token-Entropy-Guided Unlearning of LLMs
AAAI 2026
Fairness Perceptions of Large Language Models
AAAI 2026
Beyond World Models: Rethinking Understanding in AI Models
AAAI 2026