Mining Better Samples for Contrastive Learning of Temporal Correspondence

Sangryul Jeon; Dongbo Min; Seungryong Kim; Kwanghoon Sohn

CVPR 2021

Abstract

We present a novel framework for contrastive learning of pixel-level representations using only unlabeled video. Without the need for ground-truth annotation, our method collects well-defined positive correspondences by measuring their confidence, and well-defined negative ones by appropriately adjusting their hardness during training. This allows us to suppress the adverse impact of ambiguous matches and to avoid the trivial solutions that negative samples that are too hard or too easy would otherwise yield. To accomplish this, we incorporate three different criteria, ranging from a pixel-level matching confidence to a video-level one, into a bottom-up pipeline, and plan a curriculum that is aware of the current representation power to adapt the hardness of negative samples during training. With the proposed method, we attain state-of-the-art performance over the latest approaches on several video label propagation tasks.
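
The two ideas in the abstract, keeping only confident positives and controlling how hard the negatives are, can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the three confidence criteria or the curriculum rule, so the scalar `pos_confidence`, the threshold `conf_threshold`, the `hardness` knob in [0, 1], and all function names below are hypothetical stand-ins. The loss is a generic InfoNCE-style contrastive loss over cosine similarities.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def select_negatives(anchor, candidates, k, hardness):
    """Pick k negatives whose difficulty is set by `hardness` in [0, 1].

    Assumption: "hardness" is modeled as similarity to the anchor.
    hardness=0 returns the k least similar (easiest) candidates,
    hardness=1 returns the k most similar (hardest) ones.
    """
    ranked = sorted(candidates, key=lambda c: cosine(anchor, c))  # easiest first
    k = min(k, len(ranked))
    start = round(hardness * (len(ranked) - k))
    return ranked[start:start + k]


def contrastive_loss(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE-style loss for one anchor, one positive, several negatives."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


def mined_loss(anchor, positive, pos_confidence, candidates,
               conf_threshold=0.5, k=2, hardness=0.5):
    """Return the loss for a confident positive, or None if the match is ambiguous."""
    if pos_confidence < conf_threshold:
        return None  # hypothetical rule: drop low-confidence positives
    negatives = select_negatives(anchor, candidates, k, hardness)
    return contrastive_loss(anchor, positive, negatives)


anchor = [1.0, 0.0]
positive = [0.9, 0.1]
candidates = [[0.0, 1.0], [-1.0, 0.0], [0.7, 0.7], [0.95, 0.3]]

easy = mined_loss(anchor, positive, 0.9, candidates, hardness=0.0)
hard = mined_loss(anchor, positive, 0.9, candidates, hardness=1.0)
skipped = mined_loss(anchor, positive, 0.2, candidates)
```

In this toy setup a training loop would raise `hardness` over time according to some schedule. The abstract says the curriculum depends on the current representation power; how that is measured is not stated here, so the schedule is left to the caller.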

Topics

Machine Learning > Learning Types > Contrastive Learning
Machine Learning > Learning Types > Self-Supervised Learning
Computer Vision > Analysis > Object Tracking
Computer Vision > Processing > Video Understanding
Deep Learning > Techniques > Contrastive Learning
Computer Vision > Analysis > Video Understanding
Deep Learning > Learning Types > Contrastive Learning

Keywords

representation learning; contrastive learning; curriculum learning; self-supervised learning; video understanding; temporal correspondence; negative sampling; pixel-level representation; video label propagation
