Multimodal Learning
13057 directly classified papers
Papers per year
2003: 1
2006: 3
2007: 6
2008: 2
2009: 5
2010: 2
2011: 3
2012: 6
2013: 24
2014: 20
2015: 46
2016: 109
2017: 205
2018: 299
2019: 622
2020: 675
2021: 987
2022: 1084
2023: 1697
2024: 2500
2025: 3654
2026: 1107
Papers
Reconstructing Realistic and Relightable Eyes
WACV 2026
Modality and Task Adaptation for Enhanced Zero-shot Composed Image Retrieval
AAAI 2026
Dual-Domain Multimodal Hyperbolic Fusion for Cardiopulmonary Disease Diagnosis in Emergency Care
WACV 2026
Leveraging LLM-GNN Integration for Open-World Question Answering over Knowledge Graphs
EACL 2026
BigTokDetect: A Clinically-Informed Vision–Language Modeling Framework for Detecting Pro-Bigorexia Videos on TikTok
EACL 2026
Coordinates from Context: Using LLMs to Ground Complex Location References
EACL 2026
StarFlow: Generating Structured Workflow Outputs From Sketch Images
EACL 2026
A Computational Approach to Visual Metonymy
EACL 2026
Multimodal Evaluation of Russian-language Architectures
EACL 2026
AfriVox: Probing Multilingual and Accent Robustness of Speech LLMs
EACL 2026
PAL: Personal Adaptive Learner
AAAI 2026
SmartEyes: Plug-and-Play Event Detection for Retail Loss Prevention
AAAI 2026
Docora: A System for Interactive Knowledge Extraction and Visualization from Scientific PDFs
AAAI 2026
AirNavigation: Let UAV Navigation Tell Its Own Story
AAAI 2026
MemoVision: A Digital Catalog for Everyday Interactions
AAAI 2026
MulTiCast: A Multimodal Time Series Forecasting System
AAAI 2026
City of Light (COL): A City-Scale, Geo-Anchored Urban Simulator with High-Throughput Multi-Sensor Streams
AAAI 2026
VitalDiagnosis: AI-Driven Ecosystem for 24/7 Vital Monitoring and Chronic Disease Management
AAAI 2026
ATM: Enhanced Alignment for Text-to-Motion Generation
WACV 2026
MR-Pruner: Training-free Multi-resolution Visual Token Pruning for Multi-modal Large Language Models
WACV 2026
Can We Challenge Open-Vocabulary Object Detectors with Generated Content in Street Scenes?
WACV 2026
DreamCatcher: Efficient Multi-Concept Customization via Representation Finetuning
WACV 2026
Training-free Conditional Image Embedding Framework Leveraging Large Vision Language Models
WACV 2026
Evaluating the Capability of Video Question Generation for Expert Knowledge Elicitation
WACV 2026
Broadcast2Pitch: Game State Reconstruction from Unconstrained Soccer Videos
WACV 2026