Exploring Data-Efficient 3D Scene Understanding With Contrastive Scene Contexts

Ji Hou; Benjamin Graham; Matthias Niessner; Saining Xie

CVPR 2021

Abstract

The rapid progress in 3D scene understanding has come with growing demand for data; however, collecting and annotating 3D scenes (e.g., point clouds) is notoriously hard. For example, the number of scenes (e.g., indoor rooms) that can be accessed and scanned might be limited; even given sufficient data, acquiring 3D labels (e.g., instance masks) requires intensive human labor. In this paper, we explore data-efficient learning for 3D point clouds. As a first step in this direction, we propose Contrastive Scene Contexts, a 3D pre-training method that makes use of both point-level correspondences and spatial contexts in a scene. Our method achieves state-of-the-art results on a suite of benchmarks where training data or labels are scarce. Our study reveals that exhaustive labelling of 3D point clouds might be unnecessary; remarkably, on ScanNet, even when using only 0.1% of point labels, we still achieve 89% (instance segmentation) and 96% (semantic segmentation) of the baseline performance that uses full annotations.

Topics

Machine Learning > Learning Types > Contrastive Learning
Computer Vision > Analysis > 3D Vision
Deep Learning > Learning Types > Self-Supervised Learning
Deep Learning > Learning Types > Contrastive Learning
Computer Vision > Processing > Point Cloud Processing

Keywords

contrastive learning, semantic segmentation, self-supervised learning, point cloud, 3D scene understanding, instance segmentation
