Learning Local RGB-to-CAD Correspondences for Object Pose Estimation

Georgios Georgakis; Srikrishna Karanam; Ziyan Wu; Jana Kosecka

2019 ICCV ICCV 2019

Learning Local RGB-to-CAD Correspondences for Object Pose Estimation

Abstract

We consider the problem of 3D object pose estimation. While much recent work has focused on the RGB domain, the reliance on accurately annotated images limits generalizability and scalability. On the other hand, the easily available object CAD models are rich sources of data, providing a large number of synthetically rendered images. In this paper, we solve this key problem of existing methods requiring expensive 3D pose annotations by proposing a new method that matches RGB images to CAD models for object pose estimation. Our key innovations compared to existing work include removing the need for either real-world textures for CAD models or explicit 3D pose annotations for RGB images. We achieve this through a series of objectives that learn how to select keypoints and enforce viewpoint and modality invariance across RGB images and CAD model renderings. Our experiments demonstrate that the proposed method can reliably estimate object pose in RGB images and generalize to object instances not seen during training.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Computer Vision

🧭 Keyword Pioneer — keypoint selection

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Speech & Audio

Authors

Georgios Georgakis , Srikrishna Karanam , Ziyan Wu , Jana Kosecka

Topics

Artificial Intelligence > Learning Paradigms > Transfer Learning Computer Vision > Analysis > 3D Vision Computer Vision > Analysis > Object Detection

Keywords

viewpoint invariance cad model object pose estimation synthetic image rgb image keypoint selection

Download PDF

Related papers

Hierarchical Self-Attention Network for Action Localization in Videos 2019

StructureFlow: Image Inpainting via Structure-Aware Appearance Flow 2019

Overcoming Catastrophic Forgetting With Unlabeled Data in the Wild 2019

Compact Trilinear Interaction for Visual Question Answering 2019

A2J: Anchor-to-Joint Regression Network for 3D Articulated Pose Estimation From a Single Depth Image 2019