Research Explorer
Multi-Modal Learning
3194 directly classified papers
Papers per year
2003: 1
2010: 1
2011: 1
2013: 5
2014: 3
2015: 9
2016: 23
2017: 49
2018: 78
2019: 158
2020: 223
2021: 261
2022: 354
2023: 471
2024: 705
2025: 835
2026: 17
Papers
Visual Question Generation as Dual Task of Visual Question Answering
CVPR 2018
FlipDial: A Generative Model for Two-Way Visual Dialogue
CVPR 2018
Separating Self-Expression and Visual Content in Hashtag Supervision
CVPR 2018
RotationNet: Joint Object Categorization and Pose Estimation Using Multiviews From Unsupervised Viewpoints
CVPR 2018
Multimodal Visual Concept Learning With Weakly Supervised Techniques
CVPR 2018
VizWiz Grand Challenge: Answering Visual Questions From Blind People
CVPR 2018
Visual to Sound: Generating Natural Sound for Videos in the Wild
CVPR 2018
Progressively Complementarity-Aware Fusion Network for RGB-D Salient Object Detection
CVPR 2018
Stacked Latent Attention for Multimodal Reasoning
CVPR 2018
Learning Rich Features for Image Manipulation Detection
CVPR 2018
MiCT: Mixed 3D/2D Convolutional Tube for Human Action Recognition
CVPR 2018
Cross-Modal Deep Variational Hand Pose Estimation
CVPR 2018
Deep Learning Based Multi-modal Addressee Recognition in Visual Scenes with Utterances
IJCAI 2018
Audio-visual Voice Conversion Using Deep Canonical Correlation Analysis for Deep Bottleneck Features
INTERSPEECH 2018
Language Identification in Code-Mixed Data using Multichannel Neural Networks and Context Capture
EMNLP 2018
Contextual Inter-modal Attention for Multi-modal Sentiment Analysis
EMNLP 2018
LRMM: Learning to Recommend with Missing Modalities
EMNLP 2018
Cross-lingual Decompositional Semantic Parsing
EMNLP 2018
Neural Multitask Learning for Simile Recognition
EMNLP 2018
SemStyle: Learning to Generate Stylised Image Captions Using Unaligned Text
CVPR 2018
Improved Fusion of Visual and Language Representations by Dense Symmetric Co-Attention for Visual Question Answering
CVPR 2018
Learning to Localize Sound Source in Visual Scenes
CVPR 2018
Learning Translations via Images with a Massively Multilingual Image Dataset
ACL 2018
Investigating Audio, Video, and Text Fusion Methods for End-to-End Automatic Personality Prediction
ACL 2018
Connecting Language and Vision to Actions
ACL 2018