ScanNet: Richly-Annotated 3D Reconstructions of Indoor Scenes

Angela Dai; Angel X. Chang; Manolis Savva; Maciej Halber; Thomas Funkhouser; Matthias Niessner

CVPR 2017

Abstract

A key requirement for leveraging supervised deep learning methods is the availability of large, labeled datasets. Unfortunately, in the context of RGB-D scene understanding, very little data is available -- current datasets cover a small range of scene views and have limited semantic annotations. To address this issue, we introduce ScanNet, an RGB-D video dataset containing 2.5M views in 1513 scenes annotated with 3D camera poses, surface reconstructions, and semantic segmentations. To collect this data, we designed an easy-to-use and scalable RGB-D capture system that includes automated surface reconstruction and crowdsourced semantic annotation. We show that using this data helps achieve state-of-the-art performance on several 3D scene understanding tasks, including 3D object classification, semantic voxel labeling, and CAD model retrieval.

Topics

Machine Learning > Application Areas > Efficient Computing; Computer Vision > Analysis > 3D Vision; Computer Vision > Domain-Specific > Remote Sensing; Computer Vision > Processing > Semantic Segmentation; Deep Learning > Learning Types > Representation Learning; Computer Vision > Domain-Specific > 3D Vision

Keywords

3D reconstruction; semantic segmentation; scene understanding; depth estimation; surface reconstruction; object classification; crowdsourced annotation
