Explanations explained. Influence of Free-text Explanations on LLMs and the Role of Implicit Knowledge
Abstract
AbstractIn this work, we investigate the relationship between the quality of explanations produced by different models and the amount of implicit knowledge the are able to provide beyond the input. We approximate explanation quality via accuracy on a downstream task with a standardized pipeline (GEISER) and study its correlation with three different association measures, each capturing different aspects of implicitness, defined as a combination of relevance and novelty. We conduct experiments with three SOTA LLMs on four tasks involving implicit knowledge, with explanations either confirming or contradicting the correct label. Our results demonstrate that providing quality explanations consistently improves the accuracy of LLM predictions, even when the models are not explicitly trained to take explanations as input, and underline the correlation between implicit content delivered by the explanation and its effectiveness.