AAAI 2025

Semantic Ambiguity Modeling and Propagation for Fine-Grained Visual Cross View Geo-Localization

Abstract

Visual cross view geo-localization is generally approached within a joint retrieval-and-calibration framework. However, existing methods overlook semantic ambiguities arising from query and reference images characterized by low overlap, dynamic foregrounds, viewpoint changes, and perceptual aliasing. This makes it challenging to automatically control the relative importance of the two tasks, potentially compromising the retrieval task in favor of the offset regression. Consequently, the model may encounter conflicting dominating gradients during joint training. To address this, we propose to model the semantic ambiguity during the offset regression process by integrating associated uncertainty scores, represented as 2D Gaussian distributions, to mitigate negative transfer effects within the joint tasks. We further introduce an uncertainty-aware similarity metric that accounts for the semantic ambiguities of query and reference images when assessing their similarity. This metric propagates the uncertainty scores into the retrieval task, so that training focuses on low-uncertainty samples and learns discriminative feature embeddings, allowing the model to adaptively handle conflicting dominating gradients during joint training. Extensive experiments demonstrate that our method improves the overall performance of the joint tasks, achieving state-of-the-art results on the VIGOR and CVACT datasets.
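The abstract does not give formulas, so the following is only a minimal sketch of the two ideas it names, under stated assumptions. It assumes an axis-aligned (diagonal-covariance) 2D Gaussian over the offset, parameterized by a predicted log-variance per axis, and it uses a hypothetical confidence weight `1 / (1 + total variance)` to scale cosine similarity. The function names, the diagonal-covariance choice, and the weighting form are illustrative assumptions, not the paper's actual formulation.

```python
import math


def gaussian_nll_2d(pred_offset, true_offset, log_var):
    """Negative log-likelihood of true_offset under an axis-aligned 2D Gaussian.

    The Gaussian is centred at pred_offset with per-axis variance exp(log_var).
    Assumption: diagonal covariance; the paper may use a different form.
    """
    nll = math.log(2.0 * math.pi)  # normalisation constant for two dimensions
    for p, t, lv in zip(pred_offset, true_offset, log_var):
        # 0.5 * log(variance) penalises over-confident large variances,
        # while the squared error is down-weighted by the variance.
        nll += 0.5 * lv + 0.5 * (t - p) ** 2 / math.exp(lv)
    return nll


def uncertainty_weighted_similarity(query_emb, ref_emb, log_var):
    """Cosine similarity scaled by a confidence weight in (0, 1).

    Assumption: confidence = 1 / (1 + sum of per-axis variances). This is a
    hypothetical stand-in for an uncertainty-aware similarity metric.
    """
    dot = sum(q * r for q, r in zip(query_emb, ref_emb))
    norm_q = math.sqrt(sum(q * q for q in query_emb))
    norm_r = math.sqrt(sum(r * r for r in ref_emb))
    cosine = dot / (norm_q * norm_r)
    total_var = sum(math.exp(lv) for lv in log_var)
    confidence = 1.0 / (1.0 + total_var)
    return cosine * confidence


if __name__ == "__main__":
    # A large offset error costs less when the predicted variance is larger,
    # which is how an ambiguous pair can contribute a weaker regression signal.
    confident = gaussian_nll_2d((0.0, 0.0), (3.0, 0.0), (0.0, 0.0))
    uncertain = gaussian_nll_2d((0.0, 0.0), (3.0, 0.0), (2.0, 0.0))
    print(confident > uncertain)
```

In this sketch, a pair with high predicted variance both incurs a smaller regression penalty for a given error and receives a lower similarity score, which is one simple way uncertainty from the offset branch could be propagated to retrieval.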
