Norm-guided Adaptive Visual Embedding for Zero-Shot Sketch-Based Image Retrieval

Wenjie Wang; Yufeng Shi; Shiming Chen; Qinmu Peng; Feng Zheng; Xinge You

IJCAI 2021

Abstract

Zero-shot sketch-based image retrieval (ZS-SBIR), which aims to retrieve photos with sketch queries under the zero-shot scenario, has shown great promise in real-world applications. Most existing methods use language models to generate class prototypes, then use those prototypes to arrange the locations of all categories in the common space shared by photos and sketches. Although great progress has been made, few of these methods consider whether such pre-defined prototypes are necessary for ZS-SBIR: the locations of unseen-class samples in the embedding space are in fact determined by visual appearance, and a purely visual embedding performs better. To this end, we propose a novel Norm-guided Adaptive Visual Embedding (NAVE) model that adaptively builds the common space from visual similarity instead of language-based pre-defined prototypes. To further enhance the representation quality of unseen classes in both the photo and sketch modalities, a modality norm discrepancy and a noisy label regularizer are jointly employed to measure and repair the modality bias of the learned common embedding. Experiments on two challenging datasets demonstrate the superiority of NAVE over state-of-the-art competitors.
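
To make the retrieval setting concrete, the following is a minimal sketch of how a sketch query could be matched against photos once both are embedded in a common space. It is not the NAVE model: the cosine-similarity ranking, the function names (`l2_normalize`, `rank_photos`), and the toy 2-D embeddings are all assumptions introduced for illustration, and the abstract does not specify the similarity measure the authors use.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit L2 norm (eps guards against division by zero)."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)

def rank_photos(sketch_emb, photo_embs):
    """Rank photos by cosine similarity to one sketch embedding.

    Assumption: sketches and photos were already mapped into the same
    embedding space by some encoder (hypothetical here, not shown).
    Returns (photo indices from most to least similar, their similarities).
    """
    query = l2_normalize(sketch_emb[None, :])   # shape (1, d)
    gallery = l2_normalize(photo_embs)          # shape (n, d)
    sims = (gallery @ query.T).ravel()          # cosine similarity per photo
    order = np.argsort(-sims, kind="stable")    # descending similarity
    return order, sims[order]

# Toy embeddings, made up for illustration only.
photo_embs = np.array([[1.0, 0.0],
                       [0.0, 1.0],
                       [0.7, 0.7]])
sketch = np.array([0.9, 0.1])

order, sims = rank_photos(sketch, photo_embs)
print(order.tolist())  # photo indices, best match first
```

In the zero-shot setting, the query and gallery classes would be disjoint from the training classes, so a ranking like this depends entirely on how well the learned embedding generalizes.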

Topics

Machine Learning > Core Methods > Embedding Learning
Machine Learning > Learning Types > Zero-Shot Learning
Computer Vision > Analysis > Object Detection

Keywords

semantic prior; visual embedding; zero-shot sketch-based image retrieval; common space; modality norm discrepancy
