Research Explorer
Multi-Modal Learning
1213 directly classified papers
Papers per year
2007: 2
2008: 1
2009: 1
2011: 2
2012: 5
2013: 5
2014: 1
2015: 5
2016: 8
2017: 21
2018: 42
2019: 42
2020: 69
2021: 72
2022: 149
2023: 143
2024: 258
2025: 370
2026: 17
Papers
Autoregressive Styled Text Image Generation, but Make it Reliable
WACV 2026
BiNAR: A Bi-Modal Framework for Non-Aligned RGB-IR 3D Reconstruction via Gaussian Splatting
WACV 2026
MuSACo: Multimodal Subject-Specific Selection and Adaptation for Expression Recognition with Co-Training
WACV 2026
View-aware Cross-modal Distillation for Multi-view Action Recognition
WACV 2026
HOLO: Holistic Lightweight Optimization for Scene Understanding with Auto-Annotation and Multimodal Learning
WACV 2026
ScoliGaitX: A Deep Multi-Modal Fusion Network for Scoliosis Assessment via Gait Video Analysis
WACV 2026
Ordinal-Aware Multimodal Engagement Recognition for Collaborative Learning
WACV 2026
4D Multimodal Co-attention Fusion Network with Latent Contrastive Alignment for Alzheimer's Diagnosis
WACV 2026
PaRaChute: Pathology-Radiology Cross-Modal Fusion for Missing-Modality-Robust Survival Prediction
WACV 2026
Robust Multimodal Emotion Recognition from Incomplete Modalities via Query-Based Unimodal and Cross-Modal Learning
WACV 2026
BAFIS: Dataset + Framework to Assess Occupational Bias and Human Preference in Modern Text-to-image Models
WACV 2026
Grounding Degradations in Natural Language for All-In-One Video Restoration
WACV 2026
Countering Multi-modal Representation Collapse through Rank-targeted Fusion
WACV 2026
Multimodal Medical Image Binding via Shared Text Embeddings
WACV 2026
SceneEdited: A City-Scale Benchmark for 3D HD Map Updating via Image-Guided Change Detection
WACV 2026
Dual-Domain Multimodal Hyperbolic Fusion for Cardiopulmonary Disease Diagnosis in Emergency Care
WACV 2026
AuViRe: Audio-visual Speech Representation Reconstruction for Deepfake Temporal Localization
WACV 2026
QUPID: Quantified Understanding for Enhanced Performance, Insights, and Decisions in Korean Search Engines
ACL 2025
COVTrack: Continuous Open-Vocabulary Tracking via Adaptive Multi-Cue Fusion
ICCV 2025
MICE: Mixture of Image Captioning Experts Augmented e-Commerce Product Attribute Value Extraction
ACL 2025
Performance Gap in Entity Knowledge Extraction Across Modalities in Vision Language Models
ACL 2025
Proxy-Driven Robust Multimodal Sentiment Analysis with Incomplete Data
ACL 2025
Program Synthesis Benchmark for Visual Programming in XLogoOnline Environment
ACL 2025
FCMR: Robust Evaluation of Financial Cross-Modal Multi-Hop Reasoning
ACL 2025
HintsOfTruth: A Multimodal Checkworthiness Detection Dataset with Real and Synthetic Claims
ACL 2025