Responsible AI

1991 directly classified papers

Papers

Do Large Language Models Reflect Demographic Pluralism in Safety? EACL 2026

Persona Prompting as a Lens on LLM Social Reasoning EACL 2026

Engagement Undermines Safety: How Stereotypes and Toxicity Shape Humor in Language Models EACL 2026

Auditing Language Model Unlearning via Information Decomposition EACL 2026

From Numbers to Narratives: Efficient Language Model-Based Detection for Safety-Critical Minority Classes EACL 2026

CAAC: Confidence-Aware Attention Calibration to Reduce Hallucinations in Large Vision-Language Models WACV 2026

‘A Woman is More Culturally Knowledgeable than A Man?’: The Effect of Personas on Cultural Norm Interpretation in LLMs EACL 2026

Bias in the East, Bias in the West: A Bilingual Analysis of LLM Political Bias on U.S.- and China-Related Issues EACL 2026

Seeing All Sides: Multi-Perspective In-Context Learning for Subjective NLP EACL 2026

Stylistic Transfer from Annotator Communities to Large Language Models EACL 2026

Rethinking the Evaluation of Alignment Methods: Insights into Diversity, Generalisation, and Safety EACL 2026

Different Time, Different Language: Revisiting the Bias Against Non-Native Speakers in GPT Detectors EACL 2026

The Clinical Fingerprint: Comparing the Rhetorical Integrity and Epistemic Safety of Human Physicians and Large Language Models EACL 2026

Being Kind Isn’t Always Being Safe: Diagnosing Affective Hallucination in LLMs EACL 2026

Position Paper: How Should We Responsibly Adopt LLMs in the Peer Review Process? EACL 2026

VortexPIA: Indirect Prompt Injection Attack against LLMs for Efficient Extraction of User Privacy EACL 2026

CodeGuard: Improving LLM Guardrails in CS Education EACL 2026

The Unintended Trade-off of AI Alignment: Balancing Hallucination Mitigation and Safety in LLMs EACL 2026

NLP Privacy Risk Identification in Social Media (NLP-PRISM): A Survey EACL 2026

Teaching and Critiquing Conceptualization and Operationalization in NLP EACL 2026

Detecting Subtle Biases: An Ethical Lens on Underexplored Areas in AI Language Models Biases EACL 2026

Jailbreaks as Inference-Time Alignment: A Framework for Understanding Safety Failures in LLMs EACL 2026

FanarGuard: A Culturally-Aware Moderation Filter for Arabic Language Models EACL 2026

Do Psychometric Tests Work for Large Language Models? Evaluation of Tests on Sexism, Racism, and Morality EACL 2026

CAIRE: Cultural Attribution of Images with Retrieval EACL 2026