ReME: A Data-Centric Framework for Training-Free Open-Vocabulary Segmentation

Xiwei Xuan; Ziquan Deng; Kwan-Liu Ma

2025 ICCV ICCV 2025

ReME: A Data-Centric Framework for Training-Free Open-Vocabulary Segmentation

Abstract

Training-free open-vocabulary semantic segmentation (OVS) aims to segment images given a set of arbitrary textual categories without costly model fine-tuning. Existing solutions often explore attention mechanisms of pre-trained models, such as CLIP, or generate synthetic data and design complex retrieval processes to perform OVS. However, their performance is limited by the capability of reliant models or the suboptimal quality of reference sets. In this work, we investigate the largely overlooked data quality problem for this challenging dense scene understanding task, and identify that a high-quality reference set can significantly benefit training-free OVS. With this observation, we introduce a data-quality-oriented framework, comprising a data pipeline to construct a reference set with well-paired segment-text embeddings and a simple similarity-based retrieval to unveil the essential effect of data. Remarkably, extensive evaluations on ten benchmark datasets demonstrate that our method outperforms all existing training-free OVS approaches, highlighting the importance of data-centric design for advancing OVS without training.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Computer Vision and Machine Learning

🧭 Keyword Pioneer — similarity-based retrieval

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Xiwei Xuan , Ziquan Deng , Kwan-Liu Ma

Topics

Artificial Intelligence > Learning Paradigms > Transfer Learning Machine Learning > Application Areas > Domain Adaptation Computer Vision > Processing > Semantic Segmentation Machine Learning > Learning Paradigms > Transfer Learning

Keywords

semantic segmentation reference sets open-vocabulary segmentation feature retrieval text embedding clip model data-centric design data-centric learning similarity-based retrieval

Download PDF

Related papers

MA-CIR: A Multimodal Arithmetic Benchmark for Composed Image Retrieval 2025

SimMLM: A Simple Framework for Multi-modal Learning with Missing Modality 2025

MonSTeR: a Unified Model for Motion, Scene, Text Retrieval 2025

ASGS: Single-Domain Generalizable Open-Set Object Detection via Adaptive Subgraph Searching 2025

Robust Dataset Condensation using Supervised Contrastive Learning 2025