Multi-View 3D Reconstruction With Transformers

Dan Wang; Xinrui Cui; Xun Chen; Zhengxia Zou; Tianyang Shi; Septimiu Salcudean; Z. Jane Wang; Rabab Ward

ICCV 2021

Abstract

Deep CNN-based methods have so far achieved state-of-the-art results in multi-view 3D object reconstruction. Despite considerable progress, the two core modules of these methods, view feature extraction and multi-view fusion, are usually investigated separately, and the relations among multiple input views are rarely explored. Inspired by the recent success of Transformer models, we reformulate multi-view 3D reconstruction as a sequence-to-sequence prediction problem and propose a framework named 3D Volume Transformer. Unlike previous CNN-based methods that use a separate design, we unify feature extraction and view fusion in a single Transformer network. A natural advantage of our design lies in the exploration of view-to-view relationships using self-attention among multiple unordered inputs. On ShapeNet, a large-scale 3D reconstruction benchmark, our method achieves new state-of-the-art accuracy in multi-view reconstruction with 70% fewer parameters than CNN-based methods. Experimental results also suggest the strong scaling capability of our method. Our code will be made publicly available.
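
The remark about self-attention among multiple unordered inputs can be illustrated with a small sketch. This is not the authors' 3D Volume Transformer: it assumes a single attention head with identity query/key/value projections, no positional encoding, and hypothetical 3-dimensional view features, purely to show that reordering the input views only reorders the outputs, so a pooled (fused) feature does not depend on view order.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Assumption for illustration: queries, keys, and values are the raw
    tokens themselves (identity projections, no learned weights).
    """
    dim = len(tokens[0])
    scale = math.sqrt(dim)
    outputs = []
    for query in tokens:
        # Similarity of this view to every view, including itself.
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in tokens]
        weights = softmax(scores)
        # Weighted sum of all view features.
        outputs.append(
            [sum(w * value[j] for w, value in zip(weights, tokens)) for j in range(dim)]
        )
    return outputs


def mean_pool(tokens):
    """Average the per-view features into one fused feature."""
    count = len(tokens)
    return [sum(t[j] for t in tokens) / count for j in range(len(tokens[0]))]


# Hypothetical features for three input views (made-up numbers).
views = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.5], [0.5, 0.5, 1.0]]
order = [2, 0, 1]
shuffled = [views[i] for i in order]

out = self_attention(views)
out_shuffled = self_attention(shuffled)

# Shuffling the views only shuffles the per-view outputs in the same way.
same_up_to_order = all(
    math.isclose(a, b, abs_tol=1e-9)
    for i, src in enumerate(order)
    for a, b in zip(out_shuffled[i], out[src])
)
print(same_up_to_order)
```

Because no positional encoding is added in this sketch, the attention layer has no notion of view order, which is the property that makes self-attention a natural fit for an unordered set of views. A real model would add learned projections, multiple heads, and a decoder that predicts the output volume.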

Topics

Deep Learning > Architectures > Transformers
Computer Vision > Analysis > 3D Vision

Keywords

multi-view 3D reconstruction, view fusion, volume transformer

Related papers

Spatial-Temporal Transformer for Dynamic Scene Graph Generation 2021

ARAPReg: An As-Rigid-As Possible Regularization Loss for Learning Deformable Shape Generators 2021

A Broad Study on the Transferability of Visual Representations With Contrastive Learning 2021

Query Adaptive Few-Shot Object Detection With Heterogeneous Graph Convolutional Networks 2021

Self-Supervised Neural Networks for Spectral Snapshot Compressive Imaging 2021