Sense Discovery via Co-Clustering on Images and Text

Xinlei Chen; Alan Ritter; Abhinav Gupta; Tom Mitchell

CVPR 2015

Abstract

We present a co-clustering framework that can be used to discover multiple semantic and visual senses of a given Noun Phrase (NP). Unlike traditional clustering approaches, which assume a one-to-one mapping between the clusters in the text-based feature space and the visual space, we adopt a one-to-many mapping between the two spaces. This is primarily because each semantic sense (concept) can correspond to different visual senses due to viewpoint and appearance variations. Our structure-EM-style optimization not only extracts the multiple senses in both the semantic and visual feature spaces, but also discovers the mapping between the senses. We introduce a challenging dataset (CMU Polysemy-30) for this problem consisting of 30 NPs ($\sim$5600 labeled instances out of $\sim$22K total instances). We have also conducted a large-scale experiment that performs sense disambiguation for $\sim$2000 NPs.
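
The one-to-many mapping described in the abstract can be illustrated with a small toy sketch. This is not the authors' structure-EM optimization, which extracts the senses and the mapping jointly. As a simplifying assumption, the sketch clusters the text features and the image features independently with plain k-means and then assigns each visual cluster to a semantic cluster by majority vote. The function names (`kmeans`, `map_visual_to_semantic`), the synthetic 2-D features, and the cluster counts are all hypothetical choices made for illustration.

```python
import numpy as np

def kmeans(X, k, iters=20):
    """Plain k-means with deterministic farthest-point initialization."""
    centers = [X[0]]
    for _ in range(1, k):
        # Squared distance from every point to its nearest chosen center.
        d = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    for _ in range(iters):
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def map_visual_to_semantic(sem_labels, vis_labels, n_sem, n_vis):
    """Assign each visual cluster to the semantic cluster most of its members share."""
    counts = np.zeros((n_vis, n_sem), dtype=int)
    for s, v in zip(sem_labels, vis_labels):
        counts[v, s] += 1
    return counts.argmax(axis=1)

# Synthetic instances of one NP (assumed data): 2 semantic senses,
# each appearing in 2 distinct visual forms, 20 instances per form.
rng = np.random.default_rng(0)
sem_centers = np.array([[0.0, 0.0], [10.0, 10.0]])
vis_centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
text = np.vstack([sem_centers[g // 2] + 0.3 * rng.normal(size=(20, 2)) for g in range(4)])
image = np.vstack([vis_centers[g] + 0.3 * rng.normal(size=(20, 2)) for g in range(4)])

sem_labels = kmeans(text, k=2)    # semantic senses from text features
vis_labels = kmeans(image, k=4)   # visual senses from image features
mapping = map_visual_to_semantic(sem_labels, vis_labels, n_sem=2, n_vis=4)

# One-to-many view: semantic sense -> list of visual senses mapped to it.
senses = {s: np.flatnonzero(mapping == s).tolist() for s in range(2)}
print(senses)
```

On this well-separated toy data each semantic sense ends up owning two visual clusters. A joint optimization, as the abstract describes, would instead let the text and image evidence influence each other while the clusters and the mapping are being estimated.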

Topics

Artificial Intelligence > Core AI > Multimodal Learning
Computer Vision > Core AI > Multimodal Learning

Keywords

multimodal learning; noun phrase; sense disambiguation; vision language; co-clustering; sense discovery; noun phrase disambiguation; cross-modal mapping; visual sense; semantic sense
