Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation

Ross Girshick; Jeff Donahue; Trevor Darrell; Jitendra Malik

2014 CVPR CVPR 2014

Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation

Abstract

Object detection performance, as measured on the canonical PASCAL VOC dataset, has plateaued in the last few years. The best-performing methods are complex ensemble systems that typically combine multiple low-level image features with high-level context. In this paper, we propose a simple and scalable detection algorithm that improves mean average precision (mAP) by more than 30% relative to the previous best result on VOC 2012---achieving a mAP of 53.3%. Our approach combines two key insights: (1) one can apply high-capacity convolutional neural networks (CNNs) to bottom-up region proposals in order to localize and segment objects and (2) when labeled training data is scarce, supervised pre-training for an auxiliary task, followed by domain-specific fine-tuning, yields a significant performance boost. Since we combine region proposals with CNNs, we call our method R-CNN: Regions with CNN features. We also present experiments that provide insight into what the network learns, revealing a rich hierarchy of image features. Source code for the complete system is available at http://www.cs.berkeley.edu/~rbg/rcnn.

🌱 Topic Pioneer — Transfer Learning

🌉 Interdisciplinary Bridge — Computer Vision and Deep Learning

📈 Trend Setter — Pretraining

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Ross Girshick , Jeff Donahue , Trevor Darrell , Jitendra Malik

Topics

Deep Learning > Techniques > Pretraining Computer Vision > Analysis > Object Detection Computer Vision > Analysis > Semantic Segmentation Deep Learning > Techniques > Transfer Learning Deep Learning > Learning Types > Fine-Tuning

Keywords

semantic segmentation transfer learning object detection region proposal convolutional neural network

Download PDF

Related papers

Efficient Nonlinear Markov Models for Human Motion 2014

Occlusion Geodesics for Online Multi-Object Tracking 2014

A Principled Approach for Coarse-to-Fine MAP Inference 2014

Locally Optimized Product Quantization for Approximate Nearest Neighbor Search 2014

Fast and Accurate Image Matching with Cascade Hashing for 3D Reconstruction 2014