Research Explorer
Multimodal Learning
1257 directly classified papers
Papers per year
2008: 1
2009: 2
2010: 2
2011: 1
2012: 3
2013: 3
2014: 2
2015: 5
2017: 11
2018: 25
2019: 33
2020: 66
2021: 47
2022: 113
2023: 199
2024: 325
2025: 411
2026: 8
Papers
RT3D: Achieving Real-Time Execution of 3D Convolutional Neural Networks on Mobile Devices
AAAI 2021
Supervised Training of Dense Object Nets using Optimal Descriptors for Industrial Robotic Applications
AAAI 2021
A Novel Visual Interpretability for Deep Neural Networks by Optimizing Activation Maps with Perturbation
AAAI 2021
Graph-to-Graph: Towards Accurate and Interpretable Online Handwritten Mathematical Expression Recognition
AAAI 2021
Adversarial Turing Patterns from Cellular Automata
AAAI 2021
Similarity Reasoning and Filtration for Image-Text Matching
AAAI 2021
Adaptive Cross-Modal Prototypes for Cross-Domain Visual-Language Retrieval
CVPR 2021
Hierarchical Graph Attention Network for Few-Shot Visual-Semantic Learning
ICCV 2021
Cross-Modal Collaborative Representation Learning and a Large-Scale RGBT Benchmark for Crowd Counting
CVPR 2021
Understanding Object Dynamics for Interactive Image-to-Video Synthesis
CVPR 2021
DeepVideoMVS: Multi-View Stereo on Video With Recurrent Spatio-Temporal Fusion
CVPR 2021
Deep RGB-D Saliency Detection With Depth-Sensitive Attention and Automatic Multi-Modal Fusion
CVPR 2021
T2VLAD: Global-Local Sequence Alignment for Text-Video Retrieval
CVPR 2021
Cyclic Co-Learning of Sounding Object Visual Grounding and Sound Separation
CVPR 2021
Vision Matters When It Should: Sanity Checking Multimodal Machine Translation Models
EMNLP 2021
Beyond Accuracy: A Consolidated Tool for Visual Question Answering Benchmarking
EMNLP 2021
Learning to Ground Visual Objects for Visual Dialog
EMNLP 2021
VisualSem: a high-quality knowledge graph for vision and language
EMNLP 2021
RGB-D Saliency Detection via Cascaded Mutual Information Minimization
ICCV 2021
VisualMRC: Machine Reading Comprehension on Document Images
AAAI 2021
PoseBlocks: A Toolkit for Creating (and Dancing) with AI
AAAI 2021
Visual Scene Graphs for Audio Source Separation
ICCV 2021
Probabilistic Embeddings for Cross-Modal Retrieval
CVPR 2021
Visually Informed Binaural Audio Generation without Binaural Audios
CVPR 2021
YNU-HPCC at SemEval-2021 Task 6: Combining ALBERT and Text-CNN for Persuasion Detection in Texts and Images
ACL 2021