Telling Left from Right: Identifying Geometry-Aware Semantic Correspondence

Junyi Zhang; Charles Herrmann; Junhwa Hur; Eric Chen; Varun Jampani; Deqing Sun; Ming-Hsuan Yang

CVPR 2024


Abstract

While pre-trained large-scale vision models have shown significant promise for semantic correspondence, their features often struggle to grasp the geometry and orientation of instances. This paper identifies the importance of being geometry-aware for semantic correspondence and reveals a limitation of the features of current foundation models under simple post-processing. We show that incorporating this information can markedly enhance semantic correspondence performance with simple but effective solutions in both zero-shot and supervised settings. We also construct a new challenging benchmark for semantic correspondence, built from an existing animal pose estimation dataset, for both pre-training and validating models. Our method achieves a PCK@0.10 score of 65.4 (zero-shot) and 85.6 (supervised) on the challenging SPair-71k dataset, outperforming the state of the art by 5.5p and 11.0p absolute gains, respectively. Our code and datasets are publicly available at https://telling-left-from-right.github.io.
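For readers unfamiliar with the PCK@0.10 metric quoted above, the following is a minimal sketch of how a percentage-of-correct-keypoints score is commonly computed. It is not taken from the authors' code. The function name `pck`, its parameters, and the toy coordinates are illustrative assumptions, and normalizing the threshold by the larger side of the object bounding box is one common convention that may differ from the exact evaluation protocol used in the paper.

```python
import numpy as np

def pck(pred, gt, bbox_hw, alpha=0.10):
    """Fraction of predicted keypoints lying within alpha * max(h, w) of ground truth.

    pred, gt: sequences of (x, y) keypoint coordinates with matching order.
    bbox_hw:  (height, width) of the object bounding box (assumed normalizer).
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    threshold = alpha * max(bbox_hw)           # distance tolerance in pixels
    dist = np.linalg.norm(pred - gt, axis=1)   # per-keypoint Euclidean error
    return float(np.mean(dist <= threshold))

# Toy example with made-up coordinates: two of three keypoints fall within tolerance.
pred = [[10, 10], [50, 52], [90, 40]]
gt = [[12, 11], [50, 50], [60, 40]]
score = pck(pred, gt, bbox_hw=(100, 80))
```

With a 100 x 80 box and alpha = 0.10 the tolerance is 10 pixels, so the third keypoint (30 pixels off) counts as a miss. Benchmark scores such as those in the abstract are usually reported as this fraction multiplied by 100.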



Topics

Machine Learning > Learning Types > Zero-Shot Learning
Computer Vision > Analysis > Semantic Segmentation
Machine Learning > Learning Types > Multi-Modal Learning
Computer Vision > Analysis > Computer Vision

Keywords

zero-shot learning; feature matching; foundation model; semantic correspondence


Related papers

DUSt3R: Geometric 3D Vision Made Easy 2024

Bezier Everywhere All at Once: Learning Drivable Lanes as Bezier Graphs 2024

NeRFDeformer: NeRF Transformation from a Single View via 3D Scene Flows 2024

Unleashing Unlabeled Data: A Paradigm for Cross-View Geo-Localization 2024

DIMAT: Decentralized Iterative Merging-And-Training for Deep Learning Models 2024