WACV 2025

TFM^2: Training-Free Mask Matching for Open-Vocabulary Semantic Segmentation

Abstract

The potential of Open-Vocabulary Semantic Segmentation (OVSS) in few-shot scenarios has not been fully explored, owing to the complexity of extending few-shot concepts to semantic segmentation tasks. To address this challenge, we propose Training-Free Mask Matching (TFM^2), an efficient mask-based adapter method that enhances OVSS models for the few-shot open-vocabulary semantic segmentation task. TFM^2 is a key-value cache that is explicitly designed for image masks. We introduce three modules to construct and refine the mask cache, which in turn improves the mask classification performance of OVSS models. Comprehensive experiments demonstrate that TFM^2 improves the performance of state-of-the-art OVSS methods by a margin of 1% to 5% across different settings. Moreover, TFM^2 is not limited to any specific method or backbone. This work underscores the importance and potential of few-shot data in OVSS and presents a significant step toward leveraging this potential.
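
The abstract does not spell out how the mask cache is built or queried, so the following is only a minimal sketch of a generic training-free key-value cache applied to mask embeddings, in the style of cache adapters such as Tip-Adapter. It is not the authors' three-module design. The function names (`build_mask_cache`, `cache_adjusted_logits`), the hyperparameters `alpha` and `beta`, and the assumption that each mask is already summarized as one embedding vector are all hypothetical.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-8):
    """Scale vectors to (approximately) unit length along `axis`."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def build_mask_cache(support_embeddings, support_labels, num_classes):
    """Build a cache from few-shot support masks (hypothetical helper).

    Keys are normalized mask embeddings, shape (N, D).
    Values are one-hot class labels, shape (N, C).
    """
    keys = l2_normalize(np.asarray(support_embeddings, dtype=np.float64))
    values = np.eye(num_classes)[np.asarray(support_labels)]
    return keys, values


def cache_adjusted_logits(query_embeddings, zero_shot_logits, keys, values,
                          alpha=1.0, beta=5.5):
    """Blend zero-shot mask logits with cache-retrieved logits.

    alpha (blend weight) and beta (affinity sharpness) are assumed
    hyperparameters, not values taken from the paper.
    """
    queries = l2_normalize(np.asarray(query_embeddings, dtype=np.float64))
    similarity = queries @ keys.T                     # cosine similarity, (Q, N)
    affinity = np.exp(-beta * (1.0 - similarity))     # equals 1 at a perfect match
    cache_logits = affinity @ values                  # (Q, C)
    return np.asarray(zero_shot_logits, dtype=np.float64) + alpha * cache_logits


# Toy example: two support masks, one per class, with 2-D embeddings.
keys, values = build_mask_cache([[1.0, 0.0], [0.0, 1.0]], [0, 1], num_classes=2)

# The zero-shot classifier slightly prefers class 1 for this query mask,
# but the query embedding matches the cached class-0 mask.
zero_shot = np.array([[0.0, 0.5]])
adjusted = cache_adjusted_logits([[1.0, 0.0]], zero_shot, keys, values)
predicted_class = int(adjusted.argmax(axis=1)[0])
```

In this toy case the cache term outweighs the zero-shot preference, so the query mask is reassigned to the class of its nearest cached mask. No parameters are trained, which is the sense in which such an adapter is training-free.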
