Research Explorer
Multimodal Learning
13057 directly classified papers
Papers per year
2003: 1
2006: 3
2007: 6
2008: 2
2009: 5
2010: 2
2011: 3
2012: 6
2013: 24
2014: 20
2015: 46
2016: 109
2017: 205
2018: 299
2019: 622
2020: 675
2021: 987
2022: 1084
2023: 1697
2024: 2500
2025: 3654
2026: 1107
Papers
Optical Character Recognition for the International Phonetic Alphabet
EACL 2026
MedPEFT-CL: Dual-Phase Parameter-Efficient Continual Learning with Medical Semantic Adapter and Bidirectional Memory Consolidation
WACV 2026
Action Anticipation at a Glimpse: To What Extent Can Multimodal Cues Replace Video?
WACV 2026
Similarity-aware Probabilistic Embeddings Modeling for Video-Text Retrieval
WACV 2026
Human Knowledge Integrated Multi-modal Learning for Single Source Domain Generalization
WACV 2026
UniCalib: Targetless LiDAR-camera Calibration via Probabilistic Flow on Unified Depth Representations
WACV 2026
Beyond the Highlights: Video Retrieval with Salient and Surrounding Contexts
WACV 2026
Analysis of Text Accuracy and Visual Alignment in Vision-Language Models for Artistic Text Generation
WACV 2026
Sketch2Stitch: GANs for Abstract Sketch-Based Dress Synthesis
WACV 2026
GateFusion: Hierarchical Gated Cross-Modal Fusion for Active Speaker Detection
WACV 2026
FairVLM: Enhancing Fairness and Prompt Sensitivity in Vision Language Models for Medical Image Segmentation
WACV 2026
V2XScene: Multi-View Consistent 3D Scene Simulation for Collaborative Perception
WACV 2026
CAPE: A CLIP-Aware Pointing Ensemble of Complementary Heatmap Cues for Embodied Reference Understanding
WACV 2026
Large Sign Language Models: Toward 3D American Sign Language Translation
WACV 2026
PoseGaussian: Pose-Driven Novel View Synthesis for Robust 3D Human Reconstruction
WACV 2026
The Correlation Between Emotion in Text and Speech Segments is Limited: A Cross-Modal Study
EACL 2026
ArchitectHead: Continuous Level of Detail Control for 3D Gaussian Head Avatars
WACV 2026
SAFER-AiD: Saccade-Assisted Foveal-peripheral vision Enhanced Reconstruction for Adversarial Defense
WACV 2026
AuthGuard: Generalizable Deepfake Detection via Language Guidance
WACV 2026
Improving Out-of-Distribution Detection Using Segmented Images and Cross-View Attention Fusion
WACV 2026
Crafting Adversarial Inputs for Large Vision-Language Models Using Black-Box Optimization
EACL 2026
What Happens When: Learning Temporal Orders of Events in Videos
WACV 2026
SENCA-st: Integrating Spatial Transcriptomics and Histopathology with Cross Attention Shared Encoder for Region Identification in Cancer Pathology
WACV 2026
Extending Audio Context for Long-Form Understanding in Large Audio-Language Models
EACL 2026
UNO: Unifying One-stage Video Scene Graph Generation via Object-Centric Visual Representation Learning
WACV 2026