2022
AAAI
Knowledge-Enhanced Scene Graph Generation with Multimodal Relation Alignment (Student Abstract)
Abstract
Existing scene graph generation methods are limited when an image lacks sufficient visual context. To address this limitation, we propose a knowledge-enhanced scene graph generation model with multimodal relation alignment, which supplements the missing visual context with well-aligned textual knowledge. First, we encode the textual information as contextualized knowledge, guided by the visual objects, to enrich the context. Furthermore, we align the multimodal relation triplets with a co-attention module for better semantic fusion. The experimental results show the effectiveness of our method.
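The abstract does not specify how the co-attention module is built, so the following is only a minimal illustrative sketch of generic co-attention between two sets of relation-triplet embeddings, not the authors' implementation. The function names (`attend`, `co_attention`), the scaled dot-product scoring, the residual-sum fusion, and the toy 2-dimensional embeddings are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend(queries, keys):
    """For each query, return the attention-weighted average of the keys.

    Scores are scaled dot products (an assumption; the paper may use
    a learned bilinear affinity instead).
    """
    dim = len(keys[0])
    scale = math.sqrt(dim)
    contexts = []
    for q in queries:
        weights = softmax([dot(q, k) / scale for k in keys])
        contexts.append(
            [sum(w * k[i] for w, k in zip(weights, keys)) for i in range(dim)]
        )
    return contexts


def co_attention(visual, textual):
    """Hypothetical co-attention: each modality attends to the other.

    visual:  list of visual relation-triplet embeddings
    textual: list of textual relation-triplet embeddings
    Fusion here is a simple residual sum (an assumption).
    """
    vis_ctx = attend(visual, textual)   # textual context for each visual triplet
    txt_ctx = attend(textual, visual)   # visual context for each textual triplet
    fused_visual = [[a + b for a, b in zip(v, c)] for v, c in zip(visual, vis_ctx)]
    fused_textual = [[a + b for a, b in zip(t, c)] for t, c in zip(textual, txt_ctx)]
    return fused_visual, fused_textual


# Toy embeddings, invented for this example only.
visual = [[1.0, 0.0], [0.0, 1.0]]
textual = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
fused_visual, fused_textual = co_attention(visual, textual)
```

In this sketch each visual triplet receives a summary of the textual triplets most similar to it, and vice versa, which is one plausible reading of aligning triplets "for better semantic fusion". A real model would use learned projections and much higher-dimensional embeddings.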
Authors
Topics
Machine Learning > Core Methods > Representation Learning
Computer Vision > Analysis > Scene Understanding
Computer Vision > Generation > Image Generation
Computer Vision > Core AI > Multimodal Learning
Machine Learning > Learning Types > Multi-Modal Learning
Artificial Intelligence > Core AI > Knowledge Graph
Artificial Intelligence > Core AI > Multi-Modal Learning