Learning Zero-Shot Multifaceted Visually Grounded Word Embeddings via Multi-Task Training

Hassan Shahmohammadi; Hendrik P. A. Lensch; R. Harald Baayen

EMNLP 2021

Abstract

Language grounding aims at linking the symbolic representation of language (e.g., words) to the rich perceptual knowledge of the outside world. The general approach is to embed both textual and visual information into a common space (the grounded space) confined by an explicit relationship. We argue that since concrete and abstract words are processed differently in the brain, such approaches sacrifice the abstract knowledge obtained from textual statistics in the process of acquiring perceptual information. The focus of this paper is to solve this issue by implicitly grounding the word embeddings. Rather than learning two mappings into a joint space, our approach integrates modalities by implicit alignment. This is achieved by learning a reversible mapping between the textual and the grounded space by means of multi-task training. Intrinsic and extrinsic evaluations show that our way of visual grounding is highly beneficial for both abstract and concrete words. Our embeddings are correlated with human judgments and outperform previous works using pretrained word embeddings on a wide range of benchmarks. Our grounded embeddings are publicly available.
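To make the notion of a reversible mapping concrete, here is a minimal toy sketch in plain Python. It is not the authors' model: the abstract does not specify how the mapping is parameterized or trained, so this example simply assumes a fixed orthogonal matrix `M` (a 2D rotation), for which the transpose is an exact inverse. The names `matvec`, `transpose`, `M`, and the vector values are hypothetical and chosen only for illustration.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def transpose(matrix):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*matrix)]


# Assumption: the mapping is a fixed 2D rotation, which is orthogonal,
# so its transpose undoes it exactly. A learned mapping would only be
# approximately reversible unless such a constraint were enforced.
theta = 0.7
M = [
    [math.cos(theta), -math.sin(theta)],
    [math.sin(theta), math.cos(theta)],
]

text_vec = [0.3, -1.2]                              # toy "textual" embedding
grounded_vec = matvec(M, text_vec)                  # textual -> grounded space
recovered_vec = matvec(transpose(M), grounded_vec)  # grounded -> textual space

# Round-trip error; close to zero up to floating-point rounding.
max_error = max(abs(a - b) for a, b in zip(text_vec, recovered_vec))
```

The point of the sketch is only that mapping into the grounded space and back loses no textual information when the mapping is invertible, which is the property the abstract appeals to for preserving abstract knowledge from textual statistics.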

Topics

Artificial Intelligence > Core AI > Multimodal Learning
Artificial Intelligence > Learning Paradigms > Few-Shot Learning
Artificial Intelligence > Learning Paradigms > Multi-Task Learning
Machine Learning > Core Methods > Embedding Learning
Machine Learning > Core Methods > Multi-Task Learning
Machine Learning > Learning Paradigms > Zero-Shot Learning
Deep Learning > Architectures > Neural Networks
Natural Language Processing > Resources & Methods > Text Representation
Natural Language Processing > Resources & Methods > Transfer Learning

Keywords

zero-shot learning, visual grounding, language grounding, word embedding, multi-task training, visually grounded word embedding, implicit alignment, implicit grounding, multifaceted embedding, textual statistics, perceptual knowledge, abstract word
