Research Explorer
Multi-Modal Learning
1457 directly classified papers
Papers per year
2011: 1
2013: 4
2014: 3
2015: 3
2016: 9
2017: 11
2018: 27
2019: 61
2020: 109
2021: 87
2022: 153
2023: 213
2024: 391
2025: 384
2026: 1
Papers
A Global Occlusion-Aware Approach to Self-Supervised Monocular Visual Odometry
AAAI 2021
Structured Co-reference Graph Attention for Video-grounded Dialogue
AAAI 2021
Quantum Cognitively Motivated Decision Fusion for Video Sentiment Analysis
AAAI 2021
Fashion Focus: Multi-modal Retrieval System for Video Commodity Localization in E-commerce
AAAI 2021
Reasoning with Multimodal Sarcastic Tweets via Modeling Cross-Modality Contrast and Semantic Association
ACL 2020
CH-SIMS: A Chinese Multimodal Sentiment Analysis Dataset with Fine-grained Annotation of Modality
ACL 2020
Does my multimodal model learn cross-modal interactions? It’s harder to tell than you might think!
EMNLP 2020
Learning to Represent Image and Text with Denotation Graph
EMNLP 2020
PEIA: Personality and Emotion Integrated Attentive Model for Music Recommendation on Social Media Platforms
AAAI 2020
Cross-Modal Attention Network for Temporal Inconsistent Audio-Visual Event Localization
AAAI 2020
Cross-View Tracking for Multi-Human 3D Pose Estimation at Over 100 FPS
CVPR 2020
Disentangling and Unifying Graph Convolutions for Skeleton-Based Action Recognition
CVPR 2020
Searching for Actions on the Hyperbole
CVPR 2020
Cross-Modality Person Re-Identification With Shared-Specific Feature Transfer
CVPR 2020
An End-to-End Visual-Audio Attention Network for Emotion Recognition in User-Generated Videos
AAAI 2020
DeepDualMapper: A Gated Fusion Network for Automatic Map Extraction Using Aerial Images and Trajectories
AAAI 2020
M3ER: Multiplicative Multimodal Emotion Recognition using Facial, Textual, and Speech Cues
AAAI 2020
Just Ask: An Interactive Learning Framework for Vision and Language Navigation
AAAI 2020
ManyModalQA: Modality Disambiguation and QA over Diverse Inputs
AAAI 2020
Visual Agreement Regularized Training for Multi-Modal Machine Translation
AAAI 2020
Learning Long- and Short-Term User Literal-Preference with Multimodal Hierarchical Transformer Network for Personalized Image Caption
AAAI 2020
Expressing Objects Just Like Words: Recurrent Visual Embedding for Image-Text Matching
AAAI 2020
Show, Recall, and Tell: Image Captioning with Recall Mechanism
AAAI 2020
Federated Learning for Vision-and-Language Grounding Problems
AAAI 2020
DualVD: An Adaptive Dual Encoding Model for Deep Visual Understanding in Visual Dialogue
AAAI 2020