WACV 2026

Patch Your Matcher: Correspondence-Aware Image-to-Image Translation Unlocks Cross-Modal Matching via Single-Modality Priors

Abstract

Matching images across modalities is a high-impact research area. Current state-of-the-art (SOTA) methods rely on extensive multi-million-scale training protocols, which demand significant computational resources. Moreover, the learned cross-modal mapping remains largely opaque and locked within the trained matcher, with limited options for downstream use or transfer to other matchers. To enable such capabilities, we propose Patch Your Matcher (PYM) (https://xaf-cv.github.io/pym/), a highly adaptive method that leverages pre-trained single-modality matchers for cross-modal matching by co-learning an explicit, two-view geometrically consistent mapping. PYM learns image-to-image (I2I) translations that map new modalities into the original matcher's modality using a novel adversarial learning approach based on explicit evaluation of 6-DoF two-view correspondence plausibility. Trained with the semi-dense ELoFTR [80], our approach delivers substantially better cross-modal matching than classic I2I techniques and recovers 97.05% of the matching precision of the extensively trained SOTA multi-modal MINIMA [62] variant. PYM also significantly boosts the cross-modal matching performance of the uni-modal sparse LightGlue [50] and dense RoMA [23] matchers, demonstrating high transferability of the learned mapping.
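To make "6-DoF two-view correspondence plausibility" concrete, the following is a minimal sketch of one standard way to score how well a set of correspondences agrees with a relative camera pose: the Sampson epipolar error. It is an illustration only and is not PYM's adversarial objective, which the abstract does not specify. The function names, the threshold, and the synthetic scene are all assumptions made for this example.

```python
import numpy as np

def skew(t):
    """Return the 3x3 cross-product matrix of a 3-vector."""
    tx, ty, tz = t
    return np.array([[0.0, -tz, ty],
                     [tz, 0.0, -tx],
                     [-ty, tx, 0.0]])

def essential_from_pose(R, t):
    """Essential matrix E = [t]_x R for the relative pose X1 = R @ X0 + t."""
    return skew(t) @ R

def sampson_error(E, x0, x1):
    """Sampson epipolar error for normalized image points of shape (N, 2)."""
    h0 = np.hstack([x0, np.ones((len(x0), 1))])  # homogeneous points, view 0
    h1 = np.hstack([x1, np.ones((len(x1), 1))])  # homogeneous points, view 1
    Ex0 = h0 @ E.T    # row i is E @ h0[i]
    Etx1 = h1 @ E     # row i is E.T @ h1[i]
    num = np.sum(h1 * Ex0, axis=1) ** 2
    den = Ex0[:, 0] ** 2 + Ex0[:, 1] ** 2 + Etx1[:, 0] ** 2 + Etx1[:, 1] ** 2
    return num / den

def plausibility(E, x0, x1, thresh=1e-6):
    """Fraction of correspondences consistent with E (hypothetical score)."""
    return float(np.mean(sampson_error(E, x0, x1) < thresh))

# Synthetic two-view scene (assumed values, for illustration only).
rng = np.random.default_rng(0)
X0 = rng.uniform([-1.0, -1.0, 4.0], [1.0, 1.0, 8.0], size=(50, 3))
angle = 0.1  # rotation about the y-axis, in radians
R = np.array([[np.cos(angle), 0.0, np.sin(angle)],
              [0.0, 1.0, 0.0],
              [-np.sin(angle), 0.0, np.cos(angle)]])
t = np.array([0.5, 0.0, 0.1])
X1 = X0 @ R.T + t

x0 = X0[:, :2] / X0[:, 2:]  # normalized image coordinates, view 0
x1 = X1[:, :2] / X1[:, 2:]  # normalized image coordinates, view 1
E = essential_from_pose(R, t)

score_good = plausibility(E, x0, x1)                     # true correspondences
score_bad = plausibility(E, x0, np.roll(x1, 1, axis=0))  # mismatched pairs
```

Correct correspondences satisfy the epipolar constraint up to floating-point error, so `score_good` is high, while the deliberately mismatched pairs score far lower. A matcher fed a poorly translated image would tend to produce correspondences of the second kind, which is the sort of signal a geometry-aware training objective could exploit.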
