Contrastive Learning Inverts the Data Generating Process

Roland S. Zimmermann; Yash Sharma; Steffen Schneider; Matthias Bethge; Wieland Brendel

2021 ICML ICML 2021

Contrastive Learning Inverts the Data Generating Process

Abstract

Contrastive learning has recently seen tremendous success in self-supervised learning. So far, however, it is largely unclear why the learned representations generalize so effectively to a large variety of downstream tasks. We here prove that feedforward models trained with objectives belonging to the commonly used InfoNCE family learn to implicitly invert the underlying generative model of the observed data. While the proofs make certain statistical assumptions about the generative model, we observe empirically that our findings hold even if these assumptions are severely violated. Our theory highlights a fundamental connection between contrastive learning, generative modeling, and nonlinear independent component analysis, thereby furthering our understanding of the learned representations as well as providing a theoretical foundation to derive more effective contrastive losses.

🐣 Hot Topic Early Bird — theoretical analysis

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Roland S. Zimmermann , Yash Sharma , Steffen Schneider , Matthias Bethge , Wieland Brendel

Topics

Machine Learning > Learning Types > Contrastive Learning Machine Learning > Learning Types > Self-Supervised Learning Machine Learning > Optimization & Theory > Theory

Keywords

contrastive learning independent component analysis self-supervised learning theoretical analysis generative model

Download PDF

Related papers

GRAND: Graph Neural Diffusion 2021

Almost Optimal Anytime Algorithm for Batched Multi-Armed Bandits 2021

Straight to the Gradient: Learning to Use Novel Tokens for Neural Text Generation 2021

Differentiable Dynamic Quantization with Mixed Precision and Adaptive Resolution 2021

Dataset Dynamics via Gradient Flows in Probability Space 2021