Language in a Bottle: Language Model Guided Concept Bottlenecks for Interpretable Image Classification

Yue Yang; Artemis Panagopoulou; Shenghao Zhou; Daniel Jin; Chris Callison-Burch; Mark Yatskar

CVPR 2023

Abstract

Concept Bottleneck Models (CBMs) are inherently interpretable models that factor model decisions into human-readable concepts. They allow people to easily understand why a model is failing, a critical feature for high-stakes applications. However, CBMs require manually specified concepts and often under-perform their black box counterparts, which has prevented their broad adoption. We address these shortcomings and are the first to show how to construct high-performance CBMs, without manual concept specification, that reach accuracy similar to black box models. Our approach, Language Guided Bottlenecks (LaBo), leverages a language model, GPT-3, to define a large space of possible bottlenecks. Given a problem domain, LaBo uses GPT-3 to produce factual sentences about categories to form candidate concepts. LaBo efficiently searches possible bottlenecks through a novel submodular utility that promotes the selection of discriminative and diverse information. Ultimately, GPT-3's sentential concepts can be aligned to images using CLIP to form a bottleneck layer. Experiments demonstrate that LaBo is a highly effective prior for concepts important to visual recognition. In an evaluation on 11 diverse datasets, LaBo bottlenecks excel at few-shot classification: they are 11.7% more accurate than black box linear probes at 1 shot and comparable with more data. Overall, LaBo demonstrates that inherently interpretable models can be widely applied at similar, or better, performance than black box approaches.
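The pipeline described in the abstract (candidate concepts, submodular selection, image-concept alignment, bottleneck layer) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the utility is a generic "discriminability plus facility-location coverage" function rather than the utility defined in the paper, the function names (`utility`, `greedy_select`, `concept_activations`, `class_logits`) are hypothetical, and plain Python lists stand in for CLIP embeddings and GPT-3 generated concepts.

```python
from typing import List, Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def utility(selected: List[int], disc: Sequence[float],
            sim: Sequence[Sequence[float]], alpha: float = 1.0) -> float:
    """Hypothetical submodular utility (NOT the paper's exact definition).

    disc[c]   : how discriminative concept c is (assumed given).
    sim[j][c] : non-negative similarity between concepts j and c.
    The coverage term rewards sets that represent every candidate well,
    so adding a near-duplicate of a chosen concept gains little.
    """
    if not selected:
        return 0.0
    score = alpha * sum(disc[c] for c in selected)
    coverage = sum(max(row[c] for c in selected) for row in sim)
    return score + coverage


def greedy_select(disc: Sequence[float], sim: Sequence[Sequence[float]],
                  k: int, alpha: float = 1.0) -> List[int]:
    """Greedily pick up to k concepts by largest marginal gain in utility."""
    selected: List[int] = []
    remaining = set(range(len(disc)))
    while remaining and len(selected) < k:
        base = utility(selected, disc, sim, alpha)
        best = max(
            sorted(remaining),  # sorted for deterministic tie-breaking
            key=lambda c: utility(selected + [c], disc, sim, alpha) - base,
        )
        selected.append(best)
        remaining.remove(best)
    return selected


def concept_activations(image_emb: Sequence[float],
                        concept_embs: Sequence[Sequence[float]]) -> List[float]:
    """Bottleneck layer: one image-concept similarity score per concept."""
    return [dot(image_emb, c) for c in concept_embs]


def class_logits(activations: Sequence[float],
                 weights: Sequence[Sequence[float]]) -> List[float]:
    """Linear layer on top of the bottleneck: one weight row per class."""
    return [dot(activations, w) for w in weights]


if __name__ == "__main__":
    # Toy data: concepts 0 and 1 are near-duplicates, concept 2 is distinct.
    disc = [0.9, 0.85, 0.5, 0.1]
    sim = [
        [1.0, 0.95, 0.1, 0.0],
        [0.95, 1.0, 0.1, 0.0],
        [0.1, 0.1, 1.0, 0.2],
        [0.0, 0.0, 0.2, 1.0],
    ]
    print(greedy_select(disc, sim, k=2))
```

In this toy setup the greedy step skips concept 1, even though it has the second-highest discriminability score, because it is nearly redundant with concept 0. Because each class logit is a weighted sum of named concept scores, the weights can be read directly to see which concepts drive a prediction, which is the interpretability property the abstract emphasizes.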

Topics

Machine Learning > Core Methods > Classification; Machine Learning > Learning Types > Zero-Shot Learning; Deep Learning > Techniques > Model Architecture; Deep Learning > Learning Types > Few-Shot Learning; Computer Vision > Core AI > Interpretability; Deep Learning > Models > Vision-Language Models

Keywords

image classification; zero-shot learning; few-shot learning; interpretable machine learning; language model; vision-language model; interpretable model; concept bottleneck; concept bottleneck model; concept alignment; few-shot classification
