Research Explorer
Interpretability
7318 directly classified papers
Papers per year
2003: 1
2006: 1
2007: 1
2008: 1
2009: 1
2010: 5
2012: 2
2013: 10
2014: 7
2015: 14
2016: 27
2017: 84
2018: 196
2019: 395
2020: 488
2021: 771
2022: 823
2023: 954
2024: 1360
2025: 1713
2026: 464
Papers
Breach in the Shield: Unveiling the Vulnerabilities of Large Language Models
EACL 2026
Targeted Syntactic Evaluation of Language Models on Georgian Case Alignment
EACL 2026
Early-Exit and Instant Confidence Translation Quality Estimation
EACL 2026
Understanding Jailbreak Success: A Study of Latent Space Dynamics in Large Language Models
EACL 2026
Can Activation Steering Generalize Across Languages? A Study on Syllogistic Reasoning in Language Models
EACL 2026
Personality Editing for Language Models through Adjusting Self-Referential Queries
EACL 2026
HateXScore: A Metric Suite for Evaluating Reasoning Quality in Hate Speech Explanations
EACL 2026
Feature Drift: How Fine-Tuning Repurposes Representations in LLMs
EACL 2026
Persona Switch: Mixing Distinct Perspectives in Decoding Time
EACL 2026
Detection of Adversarial Prompts with Model Predictive Entropy
EACL 2026
Unveiling Decision-Making in LLMs for Text Classification: Extraction of influential and interpretable concepts with Sparse Autoencoders
EACL 2026
Hearing Between the Lines: Unlocking the Reasoning Power of LLMs for Speech Evaluation
EACL 2026
Defeating Cerberus: Privacy-Leakage Mitigation in Vision Language Models
EACL 2026
Do LLMs model human linguistic variation? A case study in Hindi-English Verb code-mixing
EACL 2026
FactSelfCheck: Fact-Level Black-Box Hallucination Detection for LLMs
EACL 2026
Punctuations and Predicates in Language Models
EACL 2026
Argument-Based Consistency in Toxicity Explanations of LLMs
EACL 2026
Learning to Judge: LLMs Designing and Applying Evaluation Rubrics
EACL 2026
DeVisE: Towards the Behavioral Testing of Medical Large Language Models
EACL 2026
A Hybrid Confidence-Aware Framework for Arabic Toxicity Detection in Social Media
EACL 2026
A Knowledge Graph Based Diagnostic Framework for Analyzing Hallucinations in Arabic Machine Reading Comprehension
EACL 2026
Position: Biomedical NLP Demands Specialization, Not Generalization
EACL 2026
Funny or Persuasive, but Not Both: Evaluating Fine-Grained Multi-Concept Control in LLMs
EACL 2026
Confidence Leaps in LLM Reasoning: Early Stopping and Cross-Model Transfer
EACL 2026
CHiRPE: A Step Towards Real-World Clinical NLP with Clinician-Oriented Model Explanations
EACL 2026