2017
EMNLP
EMNLP 2017
Hierarchically-Attentive RNN for Album Summarization and Storytelling
Abstract
AbstractWe address the problem of end-to-end visual storytelling. Given a photo album, our model first selects the most representative (summary) photos, and then composes a natural language story for the album. For this task, we make use of the Visual Storytelling dataset and a model composed of three hierarchically-attentive Recurrent Neural Nets (RNNs) to: encode the album photos, select representative (summary) photos, and compose the story. Automatic and human evaluations show our model achieves better performance on selection, generation, and retrieval than baselines.
🌉
Interdisciplinary Bridge
— Computer Vision and Deep Learning and Machine Learning and Natural Language Processing
🐣
Hot Topic Early Bird
— hierarchical attention
🐝
Cross-Pollinator
— Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio
Authors
Topics
Machine Learning > Core Methods > Representation Learning
Machine Learning > Learning Types > Self-Supervised Learning
Deep Learning > Architectures > Transformers
Computer Vision > Generation > Image Captioning
Natural Language Processing > Generation > Text Generation
Deep Learning > Architectures > Recurrent Neural Networks