Multimodal Learning
13057 directly classified papers
Papers per year
2003: 1
2006: 3
2007: 6
2008: 2
2009: 5
2010: 2
2011: 3
2012: 6
2013: 24
2014: 20
2015: 46
2016: 109
2017: 205
2018: 299
2019: 622
2020: 675
2021: 987
2022: 1084
2023: 1697
2024: 2500
2025: 3654
2026: 1107
Papers
A Compliance-Preserving Retrieval System for Aircraft MRO Task Search
EACL 2026
code-transformed: The Influence of Large Language Models on Code
EACL 2026
QueStER: Query Specification for Generative Keyword-Based Retrieval
EACL 2026
DCSN-NLP at MWE-2026 AdMIRe 2: Bridging Literal and Figurative Meaning Through Hierarchical Multimodal Reasoning
EACL 2026
SurgXBench: Explainable Vision-Language Model Benchmark for Surgery
WACV 2026
See, Think, Learn: A Self-Taught Multimodal Reasoner
WACV 2026
RampWatch: An In-the-Wild Dataset and Text-Guided Detection Framework for Recreational Vessels
WACV 2026
Beyond Faces: A Multimodal Person Clustering for Unconstrained Environments
WACV 2026
Learning Unified Spatio-temporal Representations for Efficient Compressed Video Understanding
WACV 2026
PiSA: A Self-Augmented Data Engine and Training Strategy for 3D Understanding with Large Models
WACV 2026
Geo3DVQA: Evaluating Vision-Language Models for 3D Geospatial Reasoning from Aerial Imagery
WACV 2026
SceneProp: Combining Neural Network and Markov Random Field for Scene-Graph Grounding
WACV 2026
MapVerse: A Benchmark for Geospatial Question Answering on Diverse Real-World Maps
WACV 2026
CAMP-VQA: Caption-Embedded Multimodal Perception for No-Reference Quality Assessment of Compressed Video
WACV 2026
Chain-of-Look Spatial Reasoning for Dense Surgical Instrument Counting
WACV 2026
Unsupervised Memorability Modeling from Tip-of-the-Tongue Retrieval Queries
WACV 2026
DuPLUS: Dual-Prompt Vision-Language Model for Universal Medical Image Segmentation and Prognosis
WACV 2026
DermEVAL: A Dermatologist-Reviewed Benchmark for Multimodal Large Language Models
WACV 2026
Histogram Assisted Quality Aware Generative Model for Resolution Invariant NIR Image Colorization
WACV 2026
ExDDV: A New Dataset for Explainable Deepfake Detection in Video
WACV 2026
ART: Actor-Related Tubelet for Detecting Complex-shaped Action Tubes
WACV 2026
Understanding Human-Like Biases in VLMs via Subjective Face Analytics
WACV 2026
BanglaProtha: Evaluating Vision Language Models in Underrepresented Long-tail Cultural Contexts
WACV 2026
Language Integration in Fine-Tuning Multimodal Large Language Models for Image-Based Regression
WACV 2026
Gaussian Representations for Video
WACV 2026