Research Explorer
Responsible AI
1991 directly classified papers
Papers per year
2011: 1
2016: 1
2017: 7
2018: 10
2019: 22
2020: 51
2021: 91
2022: 145
2023: 207
2024: 526
2025: 760
2026: 170
Papers
EuroGEST: Investigating gender stereotypes in multilingual language models
EMNLP 2025
Multiple LLM Agents Debate for Equitable Cultural Alignment
ACL 2025
LLMs know their vulnerabilities: Uncover Safety Gaps through Natural Distribution Shifts
ACL 2025
HESEIA: A community-based dataset for evaluating social biases in large language models, co-designed in real school settings in Latin America
EMNLP 2025
Alleviating Hallucinations from Knowledge Misalignment in Large Language Models via Selective Abstention Learning
ACL 2025
HalluLens: LLM Hallucination Benchmark
ACL 2025
Evaluation of Medical Large Language Models: Taxonomy, Review, and Directions
IJCAI 2025
Leaky Thoughts: Large Reasoning Models Are Not Private Thinkers
EMNLP 2025
Better Aligned with Survey Respondents or Training Data? Unveiling Political Leanings of LLMs on U.S. Supreme Court Cases
ACL 2025
PapersPlease: A Benchmark for Evaluating Motivational Values of Large Language Models Based on ERG Theory
ACL 2025
Representation Bending for Large Language Model Safety
ACL 2025
Leveraging Large Language Models to Measure Gender Representation Bias in Gendered Language Corpora
ACL 2025
Adapting Psycholinguistic Research for LLMs: Gender-inclusive Language in a Coreference Context
ACL 2025
MPO: Multilingual Safety Alignment via Reward Gap Optimization
ACL 2025
UTF: Under-trained Tokens as Fingerprints —— a Novel Approach to LLM Identification
ACL 2025
Guardians of Trust: Risks and Opportunities for LLMs in Mental Health
ACL 2025
Multilingual Large Language Models Leak Human Stereotypes across Language Boundaries
ACL 2025
ZJUKLAB at SemEval-2025 Task 4: Unlearning via Model Merging
ACL 2025
MSA at SemEval-2025 Task 3: High Quality Weak Labeling and LLM Ensemble Verification for Multilingual Hallucination Detection
ACL 2025
GUIR at SemEval-2025 Task 4: Adaptive Weight Tuning with Gradual Negative Matching for LLM Unlearning
ACL 2025
RaggedyFive at SemEval-2025 Task 3: Hallucination Span Detection Using Unverifiable Answer Detection
ACL 2025
JU-CSE-NLP’25 at SemEval-2025 Task 4: Learning to Unlearn LLMs
ACL 2025
PROTECT: Policy-Related Organizational Value Taxonomy for Ethical Compliance and Trust
ACL 2025
UAlign: LLM Alignment Benchmark for the Ukrainian Language
ACL 2025
Understanding PII Leakage in Large Language Models: A Systematic Survey
IJCAI 2025