Research Explorer
Multimodal Learning
13057 directly classified papers
Papers per year
2003: 1
2006: 3
2007: 6
2008: 2
2009: 5
2010: 2
2011: 3
2012: 6
2013: 24
2014: 20
2015: 46
2016: 109
2017: 205
2018: 299
2019: 622
2020: 675
2021: 987
2022: 1084
2023: 1697
2024: 2500
2025: 3654
2026: 1107
Papers
Controllable Long-term Motion Generation with Extended Joint Targets
WACV 2026
Action Anticipation at a Glimpse: To What Extent Can Multimodal Cues Replace Video?
WACV 2026
Model-free Domain Adaptation for Concealed Multimodal Large-Language Models
WACV 2026
Hybrid State Representation for Video Procedure Planning
WACV 2026
LASER: Lip Landmark Assisted Speaker Detection for Robustness
WACV 2026
Sea-CLIP: Mining Semantic-Aware Representations for Few-Shot Anomaly Detection with CLIP
WACV 2026
Boosting Medical Vision-Language Pretraining via Momentum Self-Distillation under Limited Computing Resources
WACV 2026
milliMamba: Specular-Aware Human Pose Estimation via Dual mmWave Radar with Multi-Frame Mamba Fusion
WACV 2026
LangPose: Language-Aligned Motion for Robust 3D Human Pose Estimation
WACV 2026
Streaming Real-Time Trajectory Prediction Using Endpoint-Aware Modeling
WACV 2026
Multi-Modal Soccer Scene Analysis with Masked Pre-Training
WACV 2026
mmWEAVER: Environment-Specific mmWave Signal Synthesis from a Photo and Activity Description
WACV 2026
IMPACT: Interpretable Most Important Person Analysis and Classification using Transformer-based Models
WACV 2026
HOLO: Holistic Lightweight Optimization for Scene Understanding with Auto-Annotation and Multimodal Learning
WACV 2026
DenseBEV: Transforming BEV Grid Cells into 3D Objects
WACV 2026
CLIP-IT: CLIP-based Pairing of Histology Images with Privileged Textual Information
WACV 2026
Towards Unconstrained Cross-View Pose Estimation
WACV 2026
Anatomy-VLM: A Fine-grained Vision-Language Model for Medical Interpretation
WACV 2026
Cross-Modal Event Encoder: Bridging Image-Text Knowledge to Event Streams
WACV 2026
Dual-Domain Multimodal Hyperbolic Fusion for Cardiopulmonary Disease Diagnosis in Emergency Care
WACV 2026
Training-Free Few-Shot Segmentation via Vision-Language Guided Prompting
WACV 2026
Ordinal-Aware Multimodal Engagement Recognition for Collaborative Learning
WACV 2026
CVP: Central-Peripheral Vision-Inspired Multimodal Model for Spatial Reasoning
WACV 2026
Fused Similarity Measure Based Alignment with Dual-Scale Adaptive Selection for Weakly Supervised Video Anomaly Detection
WACV 2026
AnyAnomaly: Zero-Shot Customizable Video Anomaly Detection with LVLM
WACV 2026