Research Explorer
Multi-Modal Learning
1213 directly classified papers
Papers per year
2007: 2
2008: 1
2009: 1
2011: 2
2012: 5
2013: 5
2014: 1
2015: 5
2016: 8
2017: 21
2018: 42
2019: 42
2020: 69
2021: 72
2022: 149
2023: 143
2024: 258
2025: 370
2026: 17
Papers
COVTrack: Continuous Open-Vocabulary Tracking via Adaptive Multi-Cue Fusion
ICCV 2025
ReEdit: Multimodal Exemplar-Based Image Editing
WACV 2025
Performance Gap in Entity Knowledge Extraction Across Modalities in Vision Language Models
ACL 2025
LILaC: Late Interacting in Layered Component Graph for Open-domain Multimodal Multihop Retrieval
EMNLP 2025
Optimizing Vision-Language Model for Road Crossing Intention Estimation
WACV 2025
Text Takes Over: A Study of Modality Bias in Multimodal Intent Detection
EMNLP 2025
HintsOfTruth: A Multimodal Checkworthiness Detection Dataset with Real and Synthetic Claims
ACL 2025
Aligning Text/Speech Representations from Multimodal Models with MEG Brain Activity During Listening
EMNLP 2025
SSNTrio @ DravidianLangTech 2025: Hybrid Approach for Hate Speech Detection in Dravidian Languages with Text and Audio Modalities
NAACL 2025
RCI: A Score for Evaluating Global and Local Reasoning in Multimodal Benchmarks
EMNLP 2025
QUPID: Quantified Understanding for Enhanced Performance, Insights, and Decisions in Korean Search Engines
ACL 2025
A Survey on Multi-modal Intent Recognition: Recent Advances and New Frontiers
EMNLP 2025
Bridging Traffic State and Trajectory for Dynamic Road Network and Trajectory Representation Learning
AAAI 2025
MICE: Mixture of Image Captioning Experts Augmented e-Commerce Product Attribute Value Extraction
ACL 2025
Beyond Data Quantity: Key Factors Driving Performance in Multilingual Language Models
COLING 2025
Semantically Conditioned Prompts for Visual Recognition under Missing Modality Scenarios
WACV 2025
OccFlowNet: Occupancy Estimation via Differentiable Rendering and Occupancy Flow
WACV 2025
Visual Cues Enhance Predictive Turn-Taking for Two-Party Human Interaction
ACL 2025
Consensus-Guided Incomplete Multi-view Clustering via Cross-view Affinities Learning
IJCAI 2025
Towards Cross-Modality Modeling for Time Series Analytics: A Survey in the LLM Era
IJCAI 2025
cantnlp@DravidianLangTech-2025: A Bag-of-Sounds Approach to Multimodal Hate Speech Detection
NAACL 2025
Comparing Bad Apples to Good Oranges: Aligning Large Language Models via Joint Preference Optimization
ACL 2025
End-to-End Multi-Modal Diffusion Mamba
ICCV 2025
A Picture is Worth a Thousand (Correct) Captions: A Vision-Guided Judge-Corrector System for Multimodal Machine Translation
IJCNLP 2025
Self-supervised Trusted Contrastive Multi-view Clustering with Uncertainty Refined
AAAI 2025