RETRACTED: Multi-modal Graph Fusion for Named Entity Recognition with Targeted Visual Guidance

Dong Zhang; Suzhong Wei; Shoushan Li; Hanqian Wu; Qiaoming Zhu; Guodong Zhou

AAAI 2021

Abstract

Multi-modal named entity recognition (MNER) aims to discover named entities in free text and classify them into pre-defined types with the help of accompanying images. However, dominant MNER models do not fully exploit the fine-grained semantic correspondences between semantic units of different modalities, which have the potential to refine multi-modal representation learning. To address this issue, we propose a unified multi-modal graph fusion (UMGF) approach for MNER. Specifically, we first represent the input sentence and image as a unified multi-modal graph, which captures various semantic relationships between multi-modal semantic units (words and visual objects). Then, we stack multiple graph-based multi-modal fusion layers that iteratively perform semantic interactions to learn node representations. Finally, we obtain an attention-based multi-modal representation for each word and perform entity labeling with a CRF decoder. Experiments on two benchmark datasets demonstrate the superiority of our MNER model.

Editorial Notes

This article, which was published in Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI 2021), has been retracted by agreement between the authors and AAAI.
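The graph construction and iterative fusion described in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation: the abstract does not specify the edge scheme or the fusion operator, so the fully connected intra-modal edges, the hand-supplied word-to-object alignments (`cross_edges`), and the plain mean aggregation in `fusion_step` are all assumptions chosen for brevity. The function names and example tokens are hypothetical, and the attention step and CRF decoder are omitted.

```python
from itertools import combinations


def build_unified_graph(words, objects, cross_edges):
    """Build a toy unified multi-modal graph.

    words: list of tokens; objects: list of visual-object labels;
    cross_edges: (word_index, object_index) pairs (assumed alignment).
    Returns (nodes, adjacency), where adjacency maps node index -> set of neighbours.
    """
    nodes = [("word", w) for w in words] + [("object", o) for o in objects]
    adj = {i: set() for i in range(len(nodes))}
    offset = len(words)  # object nodes are indexed after word nodes

    # Assumption: every pair of nodes within the same modality is connected.
    for a, b in combinations(range(len(words)), 2):
        adj[a].add(b)
        adj[b].add(a)
    for a, b in combinations(range(len(objects)), 2):
        adj[offset + a].add(offset + b)
        adj[offset + b].add(offset + a)

    # Inter-modal edges link a word to the visual object it is aligned with.
    for w, o in cross_edges:
        adj[w].add(offset + o)
        adj[offset + o].add(w)
    return nodes, adj


def fusion_step(features, adj):
    """One fusion round: each node averages its own vector with its neighbours'.

    Mean aggregation is a stand-in for the paper's fusion layer.
    """
    updated = []
    for i, vec in enumerate(features):
        group = [vec] + [features[j] for j in sorted(adj[i])]
        updated.append([sum(col) / len(group) for col in zip(*group)])
    return updated


# Hypothetical example: three words, two detected objects.
words = ["Kevin", "Durant", "dunks"]
objects = ["person", "ball"]
nodes, adj = build_unified_graph(words, objects, cross_edges=[(0, 0), (1, 0)])

# One-dimensional toy features: words start at 1.0, objects at 0.0.
features = [[1.0], [1.0], [1.0], [0.0], [0.0]]
fused = fusion_step(features, adj)
```

Calling `fusion_step` repeatedly on its own output mimics stacking several fusion layers: information from the visual-object nodes gradually reaches words that have no direct inter-modal edge.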

Topics

Deep Learning > Architectures > Graph Neural Networks
Deep Learning > Models > Generative Models
Natural Language Processing > Understanding > Named Entity Recognition
Machine Learning > Core Methods > Graph Neural Networks
Deep Learning > Learning Types > Multi-Modal Learning
Artificial Intelligence > Core AI > Information Extraction

Keywords

semantic correspondence; graph fusion; entity labeling; multi-modal representation learning; multi-modal named entity recognition; visual guidance; attention representation
