Research Explorer
Multimodal Learning
323 directly classified papers
Papers per year
2014: 1
2015: 1
2017: 8
2018: 11
2019: 11
2020: 27
2021: 23
2022: 46
2023: 35
2024: 53
2025: 104
2026: 3
Papers
Globetrotter: Connecting Languages by Connecting Images
CVPR 2022
M4Singer: A Multi-Style, Multi-Singer and Musical Score Provided Mandarin Singing Corpus
NIPS 2022
SGEITL: Scene Graph Enhanced Image-Text Learning for Visual Commonsense Reasoning
AAAI 2022
Multimodal Adversarially Learned Inference with Factorized Discriminators
AAAI 2022
UNISON: Unpaired Cross-Lingual Image Captioning
AAAI 2022
D-vlog: Multimodal Vlog Dataset for Depression Detection
AAAI 2022
Towards Multimodal Vision-Language Models Generating Non-generic Text
AAAI 2022
Building Goal-Oriented Dialogue Systems with Situated Visual Context
AAAI 2022
Vision-Language Pre-Training for Multimodal Aspect-Based Sentiment Analysis
ACL 2022
Things not Written in Text: Exploring Spatial Commonsense from Visual Signals
ACL 2022
MSCTD: A Multimodal Sentiment Chat Translation Dataset
ACL 2022
Leveraging Unimodal Self-Supervised Learning for Multimodal Audio-Visual Speech Recognition
ACL 2022
Finding Structural Knowledge in Multimodal-BERT
ACL 2022
What do Models Learn From Training on More Than Text? Measuring Visual Commonsense Knowledge
ACL 2022
Vision-Language Pretraining: Current Trends and the Future
ACL 2022
DuReadervis: A Chinese Dataset for Open-domain Document Visual Question Answering
ACL 2022
Sentiment Word Aware Multimodal Refinement for Multimodal Sentiment Analysis with ASR Errors
ACL 2022
Seeing the advantage: visually grounding word embeddings to better capture human semantic knowledge
ACL 2022
Visually Grounded Interpretation of Noun-Noun Compounds in English
ACL 2022
Combining Language Models and Linguistic Information to Label Entities in Memes
ACL 2022
Detecting the Role of an Entity in Harmful Memes: Techniques and their Limitations
ACL 2022
Fine-tuning and Sampling Strategies for Multimodal Role Labeling of Entities under Class Imbalance
ACL 2022
How does fake news use a thumbnail? CLIP-based Multimodal Detection on the Unrepresentative News Image
ACL 2022
Utilizing Cross-Modal Contrastive Learning to Improve Item Categorization BERT Model
ACL 2022
Can Pretrained Language Models Generate Persuasive, Faithful, and Informative Ad Text for Product Descriptions?
ACL 2022