Ranking and Retrieval of Image Sequences From Multiple Paragraph Queries

Gunhee Kim; Seungwhan Moon; Leonid Sigal

CVPR 2015

Abstract

We propose a method to rank and retrieve image sequences from a natural language text query, consisting of multiple sentences or paragraphs. One of the method's key applications is to visualize visitors' text-only reviews on TRIPADVISOR or YELP, by automatically retrieving the most illustrative image sequences. While most previous work has dealt with the relations between a natural language sentence and an image or a video, our work extends to the relations between paragraphs and image sequences. Our approach leverages the vast user-generated resource of blog posts and photo streams on the Web. We use blog posts as text-image parallel training data that co-locate informative text with representative images that are carefully selected by users. We exploit large-scale photo streams to augment the image samples for retrieval. We design a latent structural SVM framework to learn the semantic relevance relations between text and image sequences. We present both quantitative and qualitative results on the newly created DISNEYLAND dataset.
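The ranking idea behind a structural SVM can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each (query, candidate image sequence) pair has already been mapped to a hypothetical joint feature vector, it uses a plain linear score with a margin-rescaled structured hinge loss, and it omits the latent variables that the paper's framework includes. All function names and feature values are illustrative.

```python
from typing import List, Sequence


def score(w: Sequence[float], phi: Sequence[float]) -> float:
    """Linear compatibility score (dot product) for one candidate."""
    return sum(wi * pi for wi, pi in zip(w, phi))


def rank(w: Sequence[float], candidates: List[Sequence[float]]) -> List[int]:
    """Return candidate indices ordered from highest to lowest score."""
    return sorted(
        range(len(candidates)),
        key=lambda i: score(w, candidates[i]),
        reverse=True,
    )


def structured_hinge(
    w: Sequence[float],
    candidates: List[Sequence[float]],
    losses: Sequence[float],
    truth: int,
) -> float:
    """Margin-rescaled hinge: max_i [loss_i + score_i] - score_truth.

    losses[i] is a hypothetical task loss for predicting candidate i
    instead of the ground truth, with losses[truth] == 0.
    """
    augmented = max(l + score(w, c) for l, c in zip(losses, candidates))
    return augmented - score(w, candidates[truth])


# Hypothetical joint features for three candidate image sequences.
w = [1.0, 0.5]
candidates = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
losses = [1.0, 1.0, 0.0]  # candidate 2 is the ground truth

ranking = rank(w, candidates)
loss = structured_hinge(w, candidates, losses, truth=2)
```

In this toy setting the ground-truth candidate already ranks first, yet the loss stays positive because a competing candidate scores within the required margin; training would adjust `w` to widen that gap.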

Topics

Computer Science > Applications > Information Retrieval
Computer Science > Applications > Document Analysis
Deep Learning > Learning Types > Multi-Modal Learning
Computer Vision > Processing > Image Retrieval

Keywords

image retrieval, structural SVM, image sequence, natural language query, text-image matching, semantic relevance, latent structural SVM
