Self-Supervised Learning of Pretext-Invariant Representations

Ishan Misra; Laurens van der Maaten

CVPR 2020

Abstract

The goal of self-supervised learning from images is to construct image representations that are semantically meaningful via pretext tasks that do not require semantic annotations. Many pretext tasks lead to representations that are covariant with image transformations. We argue that, instead, semantic representations ought to be invariant under such transformations. Specifically, we develop Pretext-Invariant Representation Learning (PIRL, pronounced as 'pearl') that learns invariant representations based on pretext tasks. We use PIRL with a commonly used pretext task that involves solving jigsaw puzzles. We find that PIRL substantially improves the semantic quality of the learned image representations. Our approach sets a new state of the art in self-supervised learning from images on several popular benchmarks. Despite being unsupervised, PIRL outperforms supervised pre-training in learning image representations for object detection. Altogether, our results demonstrate the potential of self-supervised representations with good invariance properties.
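The invariance idea in the abstract can be illustrated with a small numerical sketch. The abstract does not state the training objective, so everything below is an assumption made for illustration: a contrastive (InfoNCE-style) loss with in-batch negatives, a hypothetical function name `invariance_contrastive_loss`, a temperature of 0.07, and random vectors standing in for network features of an image and of its transformed (for example, jigsaw-shuffled) version. It is not presented as the authors' implementation.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def invariance_contrastive_loss(feats_orig, feats_transformed, temperature=0.07):
    """Hypothetical InfoNCE-style loss (an assumption, not the paper's exact loss).

    Row i of feats_orig should be most similar to row i of feats_transformed;
    the other rows of the batch act as negatives. The loss is low when an image
    and its transformed version map to nearly the same representation.
    """
    a = l2_normalize(feats_orig)
    b = l2_normalize(feats_transformed)
    logits = a @ b.T / temperature                       # (N, N) similarity matrix
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))            # matched pairs lie on the diagonal


rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 16))  # stand-in features for 8 images

# Nearly invariant case: the transformed features barely differ from the originals.
loss_invariant = invariance_contrastive_loss(
    feats, feats + 0.01 * rng.normal(size=feats.shape)
)

# Non-invariant case: the transformed features are unrelated to the originals.
loss_unrelated = invariance_contrastive_loss(feats, rng.normal(size=feats.shape))

print(loss_invariant, loss_unrelated)
```

Under these assumptions the nearly invariant features give a much lower loss than the unrelated ones, which is the behaviour an invariance-encouraging objective is meant to reward.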

Topics

Machine Learning > Core Methods > Representation Learning
Machine Learning > Learning Types > Self-Supervised Learning
Deep Learning > Techniques > Pretraining
Deep Learning > Learning Types > Self-Supervised Learning
Deep Learning > Learning Types > Representation Learning
Computer Vision > Analysis > Image Classification

Keywords

representation learning, self-supervised learning, image representation, pretext task, semantic invariance, jigsaw puzzle

Related papers

Deep Polarization Cues for Transparent Object Segmentation 2020

HRank: Filter Pruning Using High-Rank Feature Map 2020

Panoptic-Based Image Synthesis 2020

Select, Supplement and Focus for RGB-D Saliency Detection 2020

ClusterVO: Clustering Moving Instances and Estimating Visual Odometry for Self and Surroundings 2020