Interpretability

349 directly classified papers

Papers

On the Universal Truthfulness Hyperplane Inside LLMs EMNLP 2024

Explaining Graph Neural Networks with Large Language Models: A Counterfactual Perspective on Molecule Graphs EMNLP 2024

Disentangling Interpretable Factors with Supervised Independent Subspace Principal Component Analysis NIPS 2024

An L* Algorithm for Deterministic Weighted Regular Languages EMNLP 2024

Beyond Label Attention: Transparency in Language Models for Automated Medical Coding via Dictionary Learning EMNLP 2024

Does Large Language Model Contain Task-Specific Neurons? EMNLP 2024

Beyond Accuracy: Ensuring Correct Predictions With Correct Rationales NIPS 2024

MICA: Towards Explainable Skin Lesion Diagnosis via Multi-Level Image-Concept Alignment AAAI 2024

Empowering CAM-Based Methods with Capability to Generate Fine-Grained and High-Faithfulness Explanations AAAI 2024

Exploring Diverse Representations for Open Set Recognition AAAI 2024

Keep the Faith: Faithful Explanations in Convolutional Neural Networks for Case-Based Reasoning AAAI 2024

Accelerating the Global Aggregation of Local Explanations AAAI 2024

Generative Model for Decision Trees AAAI 2024

Latent Concept-based Explanation of NLP Models EMNLP 2024

Robust Stochastic Graph Generator for Counterfactual Explanations AAAI 2024

Backward Lens: Projecting Language Model Gradients into the Vocabulary Space EMNLP 2024

Evaluating Pre-trial Programs Using Interpretable Machine Learning Matching Algorithms for Causal Inference AAAI 2024

G-LIME: Statistical Learning for Local Interpretations of Deep Neural Networks Using Global Priors (Abstract Reprint) AAAI 2024

Interactive Mars Image Content-Based Search with Interpretable Machine Learning AAAI 2024

Cluster-Norm for Unsupervised Probing of Knowledge EMNLP 2024

LACIE: Listener-Aware Finetuning for Calibration in Large Language Models NIPS 2024

Dissect Black Box: Interpreting for Rule-Based Explanations in Unsupervised Anomaly Detection NIPS 2024

Learning Bottleneck Concepts in Image Classification CVPR 2023

Shortcomings of Top-Down Randomization-Based Sanity Checks for Evaluations of Deep Neural Network Explanations CVPR 2023

Overlooked Factors in Concept-Based Explanations: Dataset Choice, Concept Learnability, and Human Capability CVPR 2023