Interpretability

7318 directly classified papers

Papers

What Makes a Good Query? Measuring the Impact of Human-Confusing Linguistic Features on LLM Performance EACL 2026

HALP: Detecting Hallucinations in Vision-Language Models without Generating a Single Token EACL 2026

SAVE: Sparse Autoencoder-Driven Visual Information Enhancement for Mitigating Object Hallucination WACV 2026

SSplain: Sparse and Smooth Explainer for Retinopathy of Prematurity Classification WACV 2026

Spotlight Your Instructions: Instruction-following with Dynamic Attention Steering EACL 2026

Direct Visual Grounding by Directing Attention of Visual Tokens WACV 2026

Causality-Driven Audits of Model Robustness WACV 2026

Feature Inversion as a Lens on Vision Encoders WACV 2026

Now You Hear Me: Audio Narrative Attacks Against Large Audio–Language Models EACL 2026

Mary, the Cheeseburger-Eating Vegetarian: Do LLMs Recognize Incoherence in Narratives? EACL 2026

LASOR: Towards Clinically Transparent and Explainable Ophthalmic Report Generation via Lesion-Aware Segmentation WACV 2026

AudioSAE: Towards Understanding of Audio-Processing Models with Sparse AutoEncoders EACL 2026

Vision-Language Models Align with Human Neural Representations in Concept Processing EACL 2026

Quantitative Lect Description: A Case Study of Lemko from the Field Data of 1920s-1930s EACL 2026

Measuring Mechanistic Independence: Can Bias Be Removed Without Erasing Demographics? EACL 2026

Steering Large Language Models for Machine Translation Personalization EACL 2026

RotBench: Evaluating Multi-modal Large Language Models on Identifying Image Rotation EACL 2026

FG-TRACER: Tracing Information Flow in Multimodal Large Language Models in Free-Form Generation WACV 2026

Democratic or Authoritarian? Probing a New Dimension of Political Biases in Large Language Models EACL 2026

Test Time Adaptation Using Adaptive Quantile Recalibration WACV 2026

SceneEval: Evaluating Semantic Coherence in Text-Conditioned 3D Indoor Scene Synthesis WACV 2026

MUSE: Model-based Uncertainty-aware Similarity Estimation for zero-shot 2D Object Detection and Segmentation WACV 2026

Uncertainty-Aware Subset Selection for Robust Visual Explainability under Distribution Shifts WACV 2026

Reasoning about Uncertainty: Do Reasoning Models Know When They Don’t Know? EACL 2026

Evidence Grounding vs. Memorization: Why Neural Semantics Matter for Knowledge Graph Fact Verification EACL 2026