Trade-off Between Efficiency and Consistency for Removal-based Explanations

Yifan Zhang; Haowei He; Zhiquan Tan; Yang Yuan

2023 NIPS NeurIPS 2023

Trade-off Between Efficiency and Consistency for Removal-based Explanations

Abstract

In the current landscape of explanation methodologies, most predominant approaches, such as SHAP and LIME, employ removal-based techniques to evaluate the impact of individual features by simulating various scenarios with specific features omitted. Nonetheless, these methods primarily emphasize efficiency in the original context, often resulting in general inconsistencies. In this paper, we demonstrate that such inconsistency is an inherent aspect of these approaches by establishing the Impossible Trinity Theorem, which posits that interpretability, efficiency, and consistency cannot hold simultaneously. Recognizing that the attainment of an ideal explanation remains elusive, we propose the utilization of interpretation error as a metric to gauge inefficiencies and inconsistencies. To this end, we present two novel algorithms founded on the standard polynomial basis, aimed at minimizing interpretation error. Our empirical findings indicate that the proposed methods achieve a substantial reduction in interpretation error, up to 31.8 times lower when compared to alternative techniques.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Machine Learning

🧭 Keyword Pioneer — interpretation error

🐣 Hot Topic Early Bird — theoretical analysis

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Speech & Audio

Authors

Yifan Zhang , Haowei He , Zhiquan Tan , Yang Yuan

Topics

Artificial Intelligence > Core AI > Interpretability Machine Learning > Optimization & Theory > Optimization Machine Learning > Optimization & Theory > Theory Machine Learning > Core Methods > Interpretability

Keywords

feature attribution theoretical analysis removal-based explanation explanation method interpretation error

Download PDF

Related papers

Risk-Averse Model Uncertainty for Distributionally Robust Safe Reinforcement Learning 2023

Generative Modeling through the Semi-dual Formulation of Unbalanced Optimal Transport 2023

Self-Supervised Motion Magnification by Backpropagating Through Optical Flow 2023

Diffused Task-Agnostic Milestone Planner 2023

Characterizing Graph Datasets for Node Classification: Homophily-Heterophily Dichotomy and Beyond 2023