PointFusion: Deep Sensor Fusion for 3D Bounding Box Estimation

Danfei Xu; Dragomir Anguelov; Ashesh Jain

CVPR 2018

Abstract

We present PointFusion, a generic 3D object detection method that leverages both image and 3D point cloud information. Unlike existing methods that either use multi-stage pipelines or rely on sensor- and dataset-specific assumptions, PointFusion is conceptually simple and application-agnostic. The image data and the raw point cloud data are processed independently by a CNN and a PointNet architecture, respectively. A novel fusion network then combines the resulting outputs and predicts multiple 3D box hypotheses and their confidences, using the input 3D points as spatial anchors. We evaluate PointFusion on two distinct datasets: the KITTI dataset, which features driving scenes captured with a lidar-camera setup, and the SUN-RGBD dataset, which captures indoor environments with RGB-D cameras. Our model is the first to perform on par with or better than the state of the art on these diverse datasets without any dataset-specific model tuning.
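
The idea of using input 3D points as spatial anchors can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that a fusion network has already produced, for every input point, eight corner offsets relative to that point and one confidence score, and that the final box is taken from the highest-scoring point. The function name `select_box` and the argument layout are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def select_box(
    points: Sequence[Point],
    corner_offsets: Sequence[Sequence[Point]],
    scores: Sequence[float],
) -> Tuple[int, List[Point]]:
    """Pick the box hypothesis of the most confident anchor point.

    points          -- n input 3D points, each used as a spatial anchor
    corner_offsets  -- n hypotheses, each a list of 8 (dx, dy, dz) offsets
                       from the anchor point to the box corners (assumed layout)
    scores          -- n confidence scores, one per hypothesis
    """
    if not (len(points) == len(corner_offsets) == len(scores)):
        raise ValueError("points, corner_offsets and scores must have equal length")
    if not points:
        raise ValueError("at least one point is required")

    # Index of the hypothesis with the highest confidence.
    best = max(range(len(scores)), key=lambda i: scores[i])

    # Convert the relative offsets back to absolute corner coordinates.
    ax, ay, az = points[best]
    corners = [(ax + dx, ay + dy, az + dz) for dx, dy, dz in corner_offsets[best]]
    return best, corners


if __name__ == "__main__":
    pts = [(0.0, 0.0, 0.0), (1.0, 2.0, 3.0)]
    offs = [[(0.5, 0.5, 0.5)] * 8, [(0.5, 0.5, 0.5)] * 8]
    idx, box = select_box(pts, offs, [0.2, 0.9])
    print(idx, box[0])
```

Predicting offsets relative to an input point, rather than absolute coordinates, keeps the regression targets small and tied to the observed geometry, which is one plausible reading of why the points serve as anchors.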

Topics

Computer Vision > Analysis > 3D Vision
Computer Vision > Analysis > Object Detection
Computer Vision > Domain-Specific > Autonomous Driving
Computer Vision > Core AI > Multi-Modal Learning

Keywords

point cloud, multi-modal learning, sensor fusion, deep learning, 3D object detection, bounding box estimation, deep sensor fusion
