Do It Yourself: Learning Semantic Correspondence from Pseudo-Labels

Olaf Dünkel; Thomas Wimmer; Christian Theobalt; Christian Rupprecht; Adam Kortylewski

ICCV 2025

Abstract

Finding correspondences between semantically similar points across images and object instances is one of the everlasting challenges in computer vision. While large pre-trained vision models have recently been demonstrated as effective priors for semantic matching, they still suffer from ambiguities for symmetric objects or repeated object parts. We propose improving semantic correspondence estimation through 3D-aware pseudo-labeling. Specifically, we train an adapter to refine off-the-shelf features using pseudo-labels obtained via 3D-aware chaining, and we filter wrong labels through relaxed cyclic consistency and 3D spherical prototype mapping constraints. While reducing the need for dataset-specific annotations compared to prior work, we establish a new state-of-the-art on SPair-71k, achieving an absolute gain of more than 4% over prior work and more than 7% over methods with similar supervision requirements. The generality of our proposed approach simplifies the extension of training to other data sources, which we demonstrate in our experiments.
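To make the idea of filtering by relaxed cyclic consistency more concrete, the following is a minimal sketch and not the authors' implementation. It assumes one common reading of the term: a point in image A is matched to its nearest neighbor in image B and then back to A, and the match is kept if the round trip lands within a pixel tolerance of the starting point. The function name `relaxed_cycle_filter`, the parameter `max_dist`, and the use of cosine similarity are illustrative assumptions.

```python
import numpy as np

def relaxed_cycle_filter(feat_a, feat_b, coords_a, max_dist):
    """Keep A->B matches whose A->B->A round trip ends near its start.

    feat_a: (Na, D) features of points in image A (hypothetical input)
    feat_b: (Nb, D) features of points in image B (hypothetical input)
    coords_a: (Na, 2) pixel coordinates of the points in image A
    max_dist: tolerance in pixels; 0 gives strict cycle consistency
    """
    # Cosine similarity between every point in A and every point in B.
    fa = feat_a / np.linalg.norm(feat_a, axis=1, keepdims=True)
    fb = feat_b / np.linalg.norm(feat_b, axis=1, keepdims=True)
    sim = fa @ fb.T

    fwd = sim.argmax(axis=1)   # best match in B for each point in A
    bwd = sim.argmax(axis=0)   # best match in A for each point in B
    back = bwd[fwd]            # index in A reached after A -> B -> A

    # Distance between each starting point and where its cycle ends.
    err = np.linalg.norm(coords_a - coords_a[back], axis=1)
    keep = err <= max_dist
    return fwd, keep

# Toy example: the second point in A drifts by one pixel after the round trip.
feat_a = np.array([[1.0, 0.0], [0.9, 0.436], [0.0, 1.0]])
feat_b = np.array([[1.0, 0.0], [0.0, 1.0]])
coords_a = np.array([[0.0, 0.0], [1.0, 0.0], [10.0, 0.0]])

_, strict = relaxed_cycle_filter(feat_a, feat_b, coords_a, max_dist=0.0)
_, relaxed = relaxed_cycle_filter(feat_a, feat_b, coords_a, max_dist=2.0)
print(strict.tolist())
print(relaxed.tolist())
```

In this toy case the strict check rejects the second point, while the relaxed check with a two-pixel tolerance keeps it. That difference is the kind of slack the word "relaxed" suggests, under the assumptions stated above.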

Topics

Machine Learning > Learning Types > Self-Supervised Learning
Deep Learning > Techniques > Pretraining
Computer Vision > Analysis > Scene Understanding
Computer Vision > Analysis > Semantic Segmentation
Deep Learning > Learning Types > Self-Supervised Learning
Computer Vision > Core AI > Computer Vision

Keywords

semantic matching, vision-language model, semantic correspondence, cyclic consistency, 3D spherical prototype, 3D-aware chaining
