Amodal Completion and Size Constancy in Natural Scenes

Abhishek Kar; Shubham Tulsiani; Joao Carreira; Jitendra Malik

2015 ICCV ICCV 2015

Amodal Completion and Size Constancy in Natural Scenes

Abstract

We consider the problem of enriching current object detection systems with veridical object sizes and relative depth estimates from a single image. There are several technical challenges to this, such as occlusions, lack of calibration data and the scale ambiguity between object size and distance. These have not been addressed in full generality in previous work. Here we propose to tackle these issues by building upon advances in object recognition and using recently created large-scale datasets. We first introduce the task of amodal bounding box completion, which aims to infer the the full extent of the object instances in the image. We then propose a probabilistic framework for learning category-specific object size distributions from available annotations and leverage these in conjunction with amodal completions to infer veridical sizes of objects in novel images. Finally, we introduce a focal length prediction approach that exploits scene recognition to overcome inherent scale ambiguities and demonstrate qualitative results on challenging real-world scenes.

🧭 Keyword Pioneer — amodal completion

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Abhishek Kar , Shubham Tulsiani , Joao Carreira , Jitendra Malik

Topics

Computer Vision > Analysis > Depth Estimation Computer Vision > Analysis > Object Detection Computer Vision > Analysis > Scene Understanding

Keywords

scene understanding object detection depth estimation amodal completion size estimation

Download PDF

Related papers

Cutting Edge: Soft Correspondences in Multimodal Scene Parsing 2015

Unsupervised Generation of a Viewpoint Annotated Car Dataset From Videos 2015

Depth-Based Hand Pose Estimation: Data, Methods, and Challenges 2015

Peeking Template Matching for Depth Extension 2015

Learning Semi-Supervised Representation Towards a Unified Optimization Framework for Semi-Supervised Learning 2015