Cross-modal Representation Learning and Relation Reasoning for Bidirectional Adaptive Manipulation

Lei Li; Kai Fan; Chun Yuan

IJCAI 2022

Abstract

Since single-modal controllable manipulation typically requires supervision of information from other modalities or cooperation with complex software and experts, this paper addresses the problem of cross-modal adaptive manipulation (CAM). The novel task performs cross-modal semantic alignment from mutual supervision and implements bidirectional exchange of attributes, relations, or objects in parallel, benefiting both modalities while significantly reducing manual effort. We introduce a robust solution for CAM, which includes two essential modules, namely Heterogeneous Representation Learning (HRL) and Cross-modal Relation Reasoning (CRR). The former is designed to perform representation learning for cross-modal semantic alignment on heterogeneous graph nodes. The latter is adopted to identify and exchange the focused attributes, relations, or objects in both modalities. Our method produces pleasing cross-modal outputs on CUB and Visual Genome.

Topics

Artificial Intelligence > Core AI > Multimodal Learning

Machine Learning > Core Methods > Representation Learning

Deep Learning > Architectures > Graph Neural Networks

Machine Learning > Learning Types > Multi-Modal Learning

Artificial Intelligence > Core AI > Knowledge Graph

Keywords

semantic alignment; relation reasoning; heterogeneous graph; cross-modal representation learning; adaptive manipulation; bidirectional adaptive manipulation
