Interpretability
173 directly classified papers
Papers per year
2013: 1
2015: 1
2016: 2
2017: 1
2018: 6
2019: 9
2020: 24
2021: 31
2022: 34
2023: 15
2024: 20
2025: 29
Papers
AUTOSUMM: A Comprehensive Framework for LLM-Based Conversation Summarization
ACL 2025
Towards Fine-Grained Interpretability: Counterfactual Explanations for Misclassification with Saliency Partition
CVPR 2025
LLaMAs Have Feelings Too: Unveiling Sentiment and Emotion Representations in LLaMA Models Through Probing
ACL 2025
ICR Probe: Tracking Hidden State Dynamics for Reliable Hallucination Detection in LLMs
ACL 2025
Serial Position Effects of Large Language Models
ACL 2025
Natural Language Counterfactual Explanations in Financial Text Classification: A Comparison of Generators and Evaluation Metrics
ACL 2025
Position-aware Automatic Circuit Discovery
ACL 2025
Are the Hidden States Hiding Something? Testing the Limits of Factuality-Encoding Capabilities in LLMs
ACL 2025
Tuning-Free Accountable Intervention for LLM Deployment – a Metacognitive Approach
AAAI 2025
Attributive Reasoning for Hallucination Diagnosis of Large Language Models
AAAI 2025
Towards Trustable SHAP Scores
AAAI 2025
Conditional Feature Importance with Generative Modeling Using Adversarial Random Forests
AAAI 2025
Prompt-CAM: Making Vision Transformers Interpretable for Fine-Grained Analysis
CVPR 2025
Improving Large Language Model Confidence Estimates using Extractive Rationales for Classification
ACL 2025
Identifying Pre-training Data in LLMs: A Neuron Activation-Based Detection Framework
EMNLP 2025
Even-if Explanations: Formal Foundations, Priorities and Complexity
AAAI 2025
How Your Location Relates to Health: Variable Importance and Interpretable Machine Learning for Environmental and Sociodemographic Data
AAAI 2025
Interpretable DNFs
IJCAI 2025
Intervening in Black Box: Concept Bottleneck Model for Enhancing Human Neural Network Mutual Understanding
ICCV 2025
Walking the Web of Concept-Class Relationships in Incrementally Trained Interpretable Models
AAAI 2025
GeoPro-Net: Learning Interpretable Spatiotemporal Prediction Models Through Statistically-Guided Geo-Prototyping
AAAI 2025
BEE: Metric-Adapted Explanations via Baseline Exploration-Exploitation
AAAI 2025
Interpretable Image Classification via Non-parametric Part Prototype Learning
CVPR 2025
Accurate Estimation of Feature Importance Faithfulness for Tree Models
AAAI 2025
Unsupervised Hallucination Detection by Inspecting Reasoning Processes
EMNLP 2025