Predicting an Object Location Using a Global Image Representation

Jose A. Rodriguez Serrano; Diane Larlus

2013 ICCV ICCV 2013

Predicting an Object Location Using a Global Image Representation

Abstract

We tackle the detection of prominent objects in images as a retrieval task: given a global image descriptor, we find the most similar images in an annotated dataset, and transfer the object bounding boxes. We refer to this approach as data driven detection (DDD), that is an alternative to sliding windows. Previous works have used similar notions but with task-independent similarities and representations, i.e. they were not tailored to the end-goal of localization. This article proposes two contributions: (i) a metric learning algorithm and (ii) a representation of images as object probability maps, that are both optimized for detection. We show experimentally that these two contributions are crucial to DDD, do not require costly additional operations, and in some cases yield comparable or better results than state-of-the-art detectors despite conceptual simplicity and increased speed. As an application of prominent object detection, we improve fine-grained categorization by precropping images with the proposed approach.

🚀 Conference Pioneer — ICCV 2013

🌉 Interdisciplinary Bridge — Computer Vision and Deep Learning and Machine Learning

🧭 Keyword Pioneer — bounding box transfer

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Jose A. Rodriguez Serrano , Diane Larlus

Topics

Machine Learning > Core Methods > Metric Learning Computer Vision > Analysis > Object Detection Machine Learning > Learning Types > Transfer Learning Deep Learning > Learning Types > Representation Learning

Keywords

metric learning feature learning transfer learning object detection image representation bounding box transfer data driven detection

Download PDF

Related papers

Large-Scale Multi-resolution Surface Reconstruction from RGB-D Sequences 2013

Cascaded Shape Space Pruning for Robust Facial Landmark Detection 2013

Unsupervised Intrinsic Calibration from a Single Frame Using a "Plumb-Line" Approach 2013

Accurate and Robust 3D Facial Capture Using a Single RGBD Camera 2013

From Where and How to What We See 2013