SenCLIP: Enhancing Zero-Shot Land-Use Mapping for Sentinel-2 with Ground-Level Prompting

Pallavi Jain; Dino Ienco; Roberto Interdonato; Tristan Berchoux; Diego Marcos

WACV 2025

Abstract

Pre-trained vision-language models (VLMs) such as CLIP demonstrate impressive zero-shot classification capabilities with free-form prompts and even show some generalization in specialized domains. However, their performance on satellite imagery is limited by the underrepresentation of such data in their training sets, which consist predominantly of ground-level images. Existing prompting techniques for satellite imagery are often restricted to generic phrases like "a satellite image of...", which limits their effectiveness for zero-shot land-use/land-cover (LULC) mapping. To address these challenges, we introduce SenCLIP, which transfers CLIP's representation to Sentinel-2 imagery by leveraging a large dataset of Sentinel-2 images paired with geotagged ground-level photos from across Europe. We evaluate SenCLIP alongside other state-of-the-art remote sensing VLMs on zero-shot LULC mapping tasks using the EuroSAT and BigEarthNet datasets, with both aerial and ground-level prompting styles. Our approach, which aligns ground-level representations with satellite imagery, demonstrates significant improvements in classification accuracy across both prompt styles, opening new possibilities for applying free-form textual descriptions in zero-shot LULC mapping. Code, dataset, and pretrained models are available at https://github.com/pallavijain-pj/SenCLIP
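The zero-shot classification step that the abstract refers to can be illustrated with a minimal sketch. This is not the authors' implementation: it only shows the generic CLIP-style recipe of averaging the embeddings of several prompts per class (for example, a mix of aerial and ground-level descriptions) and assigning each image to the class with the highest cosine similarity. The function names, class names, and toy embeddings below are all hypothetical stand-ins; in practice the vectors would come from the text and image encoders of a CLIP-like model.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products equal cosine similarity."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def build_class_prototypes(prompt_embeddings):
    """Average the normalized prompt embeddings of each class into one prototype.

    prompt_embeddings: dict mapping class name -> array of shape (n_prompts, dim).
    Returns the class names and an array of shape (n_classes, dim).
    """
    names = list(prompt_embeddings)
    protos = np.stack([
        l2_normalize(l2_normalize(prompt_embeddings[name]).mean(axis=0))
        for name in names
    ])
    return names, protos


def zero_shot_classify(image_embeddings, class_prototypes):
    """Return the index of the most similar class prototype for each image."""
    sims = l2_normalize(image_embeddings) @ class_prototypes.T
    return sims.argmax(axis=1), sims


# Toy stand-ins for encoder outputs (assumption: real embeddings would come
# from a CLIP-like text encoder and a satellite image encoder).
rng = np.random.default_rng(0)
dim = 8
class_dirs = {"forest": np.eye(dim)[0], "residential": np.eye(dim)[1]}

# Three hypothetical prompts per class, modeled as noisy copies of one direction.
prompt_embeddings = {
    name: direction + 0.05 * rng.normal(size=(3, dim))
    for name, direction in class_dirs.items()
}
names, protos = build_class_prototypes(prompt_embeddings)

# Two hypothetical image embeddings: one residential tile, one forest tile.
images = np.stack([class_dirs["residential"], class_dirs["forest"]])
images = images + 0.05 * rng.normal(size=(2, dim))

pred, sims = zero_shot_classify(images, protos)
predicted_labels = [names[i] for i in pred]
print(predicted_labels)
```

Averaging over several prompts per class is a common prompt-ensembling choice; it makes the class prototype less sensitive to the wording of any single prompt, which is relevant when comparing aerial and ground-level prompting styles.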

Topics

Artificial Intelligence > Core AI > Multimodal Learning
Machine Learning > Application Areas > Domain Adaptation
Artificial Intelligence > Learning Paradigms > Zero-Shot Learning

Keywords

cross-modal transfer; vision-language model; representation alignment; zero-shot classification; satellite imagery; land-use mapping
