WACV 2026

MUSE: Model-based Uncertainty-aware Similarity Estimation for zero-shot 2D Object Detection and Segmentation

Abstract

We present MUSE (Model-based Uncertainty-aware Similarity Estimation), a training-free framework for model-based zero-shot 2D object detection and segmentation. MUSE takes two inputs: 2D multi-view templates obtained from 3D models of unseen objects, and 2D object proposals extracted from the query image. In the embedding stage, we propose a feature embedding scheme that integrates class and patch embeddings; the patch embeddings are normalized using generalized mean (GeM) pooling. In the matching stage, we introduce a joint similarity score that combines an absolute score and a relative score. Finally, we update the similarity score using an uncertainty-aware object prior. MUSE achieves state-of-the-art performance on the BOP Challenge 2025, ranking first in the Classic Core, H3, and Industrial tracks, without any additional training or fine-tuning. We therefore believe that MUSE is a promising framework for zero-shot 2D object detection and segmentation.
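To make the embedding and matching stages more concrete, the following is a minimal sketch in plain Python, not the authors' implementation. The GeM formula is the standard one (a per-dimension power mean with clamping), but everything else is an assumption made for illustration: the function names (`gem_pool`, `embed`, `joint_score`), the exponent `p=3`, combining class and patch embeddings by concatenation, taking the absolute score as the best cosine similarity over an object's templates, and taking the relative score as a softmax over objects with a hypothetical `temperature`. The abstract does not specify these details, and the uncertainty-aware object prior is not sketched here for the same reason.

```python
import math


def gem_pool(patches, p=3.0, eps=1e-6):
    """Generalized mean (GeM) over patch vectors, computed per dimension.

    p = 1 reduces to average pooling; large p approaches max pooling.
    Values are clamped to eps, as in the common GeM formulation.
    """
    n = len(patches)
    dim = len(patches[0])
    return [
        (sum(max(vec[d], eps) ** p for vec in patches) / n) ** (1.0 / p)
        for d in range(dim)
    ]


def l2_normalize(vec, eps=1e-12):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / max(norm, eps) for v in vec]


def embed(cls_token, patches, p=3.0):
    """Assumed fusion: concatenate the normalized class embedding with the
    normalized GeM-pooled patch embedding, then renormalize."""
    fused = l2_normalize(cls_token) + l2_normalize(gem_pool(patches, p))
    return l2_normalize(fused)


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return sum(x * y for x, y in zip(l2_normalize(a), l2_normalize(b)))


def joint_score(proposal, templates_by_object, temperature=0.1):
    """Assumed joint score: absolute score times relative score.

    absolute: best cosine similarity over an object's templates.
    relative: softmax of the absolute scores across all objects.
    """
    absolute = {
        obj: max(cosine(proposal, t) for t in templates)
        for obj, templates in templates_by_object.items()
    }
    top = max(absolute.values())  # subtract the max for numerical stability
    exps = {obj: math.exp((s - top) / temperature) for obj, s in absolute.items()}
    total = sum(exps.values())
    return {obj: absolute[obj] * exps[obj] / total for obj in absolute}
```

In this sketch, a proposal would be embedded with `embed`, each template would be embedded the same way, and `joint_score` would return one score per candidate object; the product form means an object must both match well in absolute terms and stand out from the other candidates to score highly.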
