Multimodal Learning
13057 directly classified papers
Papers per year
2003: 1
2006: 3
2007: 6
2008: 2
2009: 5
2010: 2
2011: 3
2012: 6
2013: 24
2014: 20
2015: 46
2016: 109
2017: 205
2018: 299
2019: 622
2020: 675
2021: 987
2022: 1084
2023: 1697
2024: 2500
2025: 3654
2026: 1107
Papers
Semantic Map Guided Bird's-Eye View Learning for Online HD Map Construction
WACV 2026
DETONATE – A Benchmark for Text-to-Image Alignment and Kernelized Direct Preference Optimization
AAAI 2026
TreeBridge: Aligning LLM Embeddings in Industrial Recommender Systems
AAAI 2026
See More, Store Less: Memory-Efficient Resolution for Video Moment Retrieval
EACL 2026
Multimodal Graph Representation Learning over Arbitrary Sets of Modalities
WACV 2026
Journey Before Destination: On the importance of Visual Faithfulness in Slow Thinking
EACL 2026
TRACE: A Framework for Analyzing and Enhancing Stepwise Reasoning in Vision-Language Models
EACL 2026
Do Images Speak Louder than Words? Investigating the Effect of Textual Misinformation in VLMs
EACL 2026
Can MLLMs Find Their Way in a City? Exploring Emergent Navigation from Web-Scale Knowledge
EACL 2026
VisDoT: Enhancing Visual Reasoning through Human-Like Interpretation Grounding and Decomposition of Thought
EACL 2026
A Unified View on Emotion Representation in Large Language Models
EACL 2026
Chat-Ghosting: Methods for Auto-Completion in Dialog Systems
EACL 2026
Do Audio LLMs Really LISTEN, or Just Transcribe? Measuring Lexical vs. Acoustic Emotion Cues Reliance
EACL 2026
ExStrucTiny: A Benchmark for Schema-Variable Structured Information Extraction from Document Images
EACL 2026
DeepInsert: Early Layer Bypass for Efficient and Performant Multimodal Understanding
EACL 2026
Surprisal from Larger Transformer-based Language Models Predicts fMRI Data More Poorly
EACL 2026
On the Additive Compositionality of Task Vectors in Vision–Language Models
EACL 2026
FiMMIA: scaling semantic perturbation-based membership inference across modalities
EACL 2026
Mask What Matters: Mitigating Object Hallucinations in Multimodal Large Language Models with Object-Aligned Visual Contrastive Decoding
EACL 2026
Bring the Apple, Not the Sofa: Impact of Irrelevant Context in Embodied AI Commands on VLA Models
EACL 2026
Compact Multimodal Language Models as Robust OCR Alternatives for Noisy Textual Clinical Reports
EACL 2026
Adapting Vision-Language Models for E-commerce Understanding at Scale
EACL 2026
TechING: Towards Real World Technical Image Understanding via VLMs
EACL 2026
Unlocking Large Audio-Language Models for Interactive Language Learning
EACL 2026
Benchmarking Direct Preference Optimization for Medical Large Vision–Language Models
EACL 2026