Pri3D: Can 3D Priors Help 2D Representation Learning?

Ji Hou; Saining Xie; Benjamin Graham; Angela Dai; Matthias Nießner

2021 ICCV ICCV 2021

Pri3D: Can 3D Priors Help 2D Representation Learning?

Abstract

Recent advances in 3D perception have shown impressive progress in understanding geometric structures of 3D shapes and even scenes. Inspired by these advances in geometric understanding, we aim to imbue image-based perception with representations learned under geometric constraints. We introduce an approach to learn view-invariant, geometry-aware representations for network pre-training, based on multi-view RGB-D data, that can then be effectively transferred to downstream 2D tasks. We propose to employ contrastive learning under both multi-view image constraints and image-geometry constraints to encode 3D priors into learned 2D representations. This results not only in improvement over 2D-only representation learning on the image-based tasks of semantic segmentation, instance segmentation, and object detection on real-world indoor datasets, but moreover, provides significant improvement in the low data regime. We show a significant improvement of 6.0% on semantic segmentation on full data as well as 11.9% on 20% data against our baselines on ScanNet.

❓ The Questioner

🌉 Interdisciplinary Bridge — Computer Vision and Deep Learning and Machine Learning

🧭 Keyword Pioneer — 3d prior

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Ji Hou , Saining Xie , Benjamin Graham , Angela Dai , Matthias Nießner

Topics

Deep Learning > Techniques > Pretraining Computer Vision > Analysis > 3D Vision Computer Vision > Analysis > Semantic Segmentation Machine Learning > Learning Types > Transfer Learning

Keywords

contrastive learning semantic segmentation transfer learning multi-view learning geometric structure 3d prior

Download PDF

Related papers

Spatial-Temporal Transformer for Dynamic Scene Graph Generation 2021

ARAPReg: An As-Rigid-As Possible Regularization Loss for Learning Deformable Shape Generators 2021

A Broad Study on the Transferability of Visual Representations With Contrastive Learning 2021

Query Adaptive Few-Shot Object Detection With Heterogeneous Graph Convolutional Networks 2021

Self-Supervised Neural Networks for Spectral Snapshot Compressive Imaging 2021