TIER: Text-Image Entropy Regularization for Medical CLIP-style models

Anil Palepu; Andrew Beam

MLHC 2023

Abstract

In this paper, we introduce a novel regularization scheme for contrastive language-image pre-trained (CLIP) medical vision models. Our approach is based on the observation that, for many medical imaging tasks, each text token should describe only a small number of image regions and, likewise, each image region should correspond to only a few text tokens. In CLIP-style models, this implies that, for a given image-text pair, each text-token embedding should have high similarity to only a small number of image-patch embeddings. We formalize this observation using a regularization scheme that penalizes the entropy of the text-token to image-patch similarity scores. We demonstrate qualitatively and quantitatively that the proposed regularization scheme improves localization by shrinking most of the pairwise text-token and image-patch similarity scores towards zero, thus achieving the desired effect. We demonstrate the promise of our approach in an important medical context, chest x-rays, where this underlying sparsity hypothesis naturally arises. Using our proposed approach, we achieve state-of-the-art (SOTA) average zero-shot performance on the CheXpert and PadChest chest x-ray datasets, outperforming an unregularized version of the model and several recently published self-supervised models.
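The entropy penalty described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how similarity scores are normalized or weighted, so the softmax normalization, the function names, and the `weight_token` and `weight_patch` parameters below are assumptions made for illustration. The sketch takes a matrix of token-to-patch similarity scores, turns each row (one token over all patches) and each column (one patch over all tokens) into a probability distribution, and averages the resulting entropies.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def entropy_penalty(sim, weight_token=1.0, weight_patch=1.0):
    """Hypothetical entropy penalty on a similarity matrix.

    sim[t][p] is the similarity between text token t and image patch p.
    The softmax normalization and the two weights are assumptions,
    not details taken from the abstract.
    """
    n_tokens, n_patches = len(sim), len(sim[0])
    # Each token's distribution over patches (rows).
    token_term = sum(entropy(softmax(row)) for row in sim) / n_tokens
    # Each patch's distribution over tokens (columns).
    columns = [[sim[t][p] for t in range(n_tokens)] for p in range(n_patches)]
    patch_term = sum(entropy(softmax(col)) for col in columns) / n_patches
    return weight_token * token_term + weight_patch * patch_term


# Two tokens, three patches: sharply peaked vs. uniform similarities.
peaked = [[10.0, 0.0, 0.0], [0.0, 10.0, 0.0]]
diffuse = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
print(entropy_penalty(peaked) < entropy_penalty(diffuse))
```

A sparse similarity pattern, where each token matches one patch strongly, yields a lower penalty than a uniform one. Under the assumptions above, such a term would be added to the usual CLIP contrastive loss so that training favors sparse token-patch alignments.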

Topics

Machine Learning > Learning Types > Contrastive Learning
Deep Learning > Architectures > Transformers

Keywords

medical imaging, entropy regularization, zero-shot classification, chest x-ray, contrastive language-image pre-training
