Relation Also Need Attention: Integrating Relation Information Into Image Captioning

Tianyu Chen; Zhixin Li; Tiantao Xian; Canlong Zhang; Huifang Ma

ACML 2021

Abstract

Attention-based methods lead the field of image captioning, especially models that combine global and local attention. However, few existing models integrate the relationship information between different regions of an image. In this paper, such relation features are embedded into a fused attention mechanism to explore the internal visual and semantic relations between object regions. In addition, to alleviate the exposure bias problem and make training more efficient, we combine a Generative Adversarial Network with Reinforcement Learning and use greedy decoding to generate a dynamic baseline reward for self-critical training. Finally, experiments on the MSCOCO dataset show that the model generates more accurate and vivid captions and outperforms previous advanced models on multiple prevailing metrics.
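The self-critical training step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard self-critical formulation, in which the reward of a greedily decoded caption serves as the baseline for a sampled caption. The function name, its arguments, and the scalar rewards are hypothetical; the abstract does not specify how the paper computes rewards or combines them with the adversarial component, so this is not the authors' implementation.

```python
from typing import Sequence


def self_critical_loss(sample_logprobs: Sequence[float],
                       sample_reward: float,
                       greedy_reward: float) -> float:
    """Policy-gradient loss for one sampled caption (hypothetical helper).

    sample_logprobs: log-probabilities of the sampled caption's tokens.
    sample_reward:   reward of the sampled caption (assumed to be a scalar,
                     e.g. a caption metric score).
    greedy_reward:   reward of the greedily decoded caption for the same
                     image, used as the dynamic baseline.
    """
    # Advantage: how much better the sampled caption is than the greedy one.
    advantage = sample_reward - greedy_reward
    # Minimizing this loss raises the likelihood of samples that beat the
    # baseline and lowers the likelihood of samples that fall below it.
    return -advantage * sum(sample_logprobs)


# Illustrative values only: a two-token sample that beats the baseline.
loss = self_critical_loss([-0.5, -1.0], sample_reward=0.75, greedy_reward=0.5)
```

Because the baseline comes from the model's own greedy output, it changes as the model improves and needs no separately learned value estimator.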

Topics

Artificial Intelligence > Core AI > Multimodal Learning; Machine Learning > Learning Types > Adversarial Learning; Reinforcement Learning > Methods > Deep RL

Keywords

reinforcement learning; attention mechanism; image captioning; generative adversarial network; relation information
