Using Captum to Explain Generative Language Models

Vivek Miglani; Aobo Yang; Aram Markosyan; Diego Garcia-Olano; Narine Kokhlikyan

EMNLP 2023

Abstract

Captum is a comprehensive library for model explainability in PyTorch, offering a range of methods from the interpretability literature to enhance users’ understanding of PyTorch models. In this paper, we introduce new features in Captum that are specifically designed to analyze the behavior of generative language models. We provide an overview of the available functionalities and example applications that illustrate their potential for understanding learned associations within generative language models.

Topics

Artificial Intelligence > Core AI > Interpretability

Keywords

feature attribution, model interpretability, model explainability, generative language model
