SUN3D: A Database of Big Spaces Reconstructed Using SfM and Object Labels

Jianxiong Xiao; Andrew Owens; Antonio Torralba

2013 ICCV ICCV 2013

SUN3D: A Database of Big Spaces Reconstructed Using SfM and Object Labels

Abstract

Existing scene understanding datasets contain only a limited set of views of a place, and they lack representations of complete 3D spaces. In this paper, we introduce SUN3D, a large-scale RGB-D video database with camera pose and object labels, capturing the full 3D extent of many places. The tasks that go into constructing such a dataset are difficult in isolation hand-labeling videos is painstaking, and structure from motion (SfM) is unreliable for large spaces. But if we combine them together, we make the dataset construction task much easier. First, we introduce an intuitive labeling tool that uses a partial reconstruction to propagate labels from one frame to another. Then we use the object labels to fix errors in the reconstruction. For this, we introduce a generalization of bundle adjustment that incorporates object-to-object correspondences. This algorithm works by constraining points for the same object from different frames to lie inside a fixed-size bounding box, parameterized by its rotation and translation. The SUN3D database, the source code for the generalized bundle adjustment, and the web-based 3D annotation tool are all available at http://sun3d.cs.princeton.edu.

🚀 Conference Pioneer — ICCV 2013

📈 Trend Setter — Egocentric Vision

🧭 Keyword Pioneer — rgb-d video

🐣 Hot Topic Early Bird — structure from motion

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Jianxiong Xiao , Andrew Owens , Antonio Torralba

Topics

Computer Vision > Analysis > 3D Vision Computer Vision > Analysis > Scene Understanding Computer Vision > Domain-Specific > Egocentric Vision

Keywords

3d reconstruction scene understanding depth estimation structure from motion rgb-d video bundle adjustment object labeling

Download PDF

Related papers

Large-Scale Multi-resolution Surface Reconstruction from RGB-D Sequences 2013

Cascaded Shape Space Pruning for Robust Facial Landmark Detection 2013

Unsupervised Intrinsic Calibration from a Single Frame Using a "Plumb-Line" Approach 2013

Accurate and Robust 3D Facial Capture Using a Single RGBD Camera 2013

From Where and How to What We See 2013