Research Explorer
Multi-Modal Learning
1213 directly classified papers
Papers per year
2007: 2
2008: 1
2009: 1
2011: 2
2012: 5
2013: 5
2014: 1
2015: 5
2016: 8
2017: 21
2018: 42
2019: 42
2020: 69
2021: 72
2022: 149
2023: 143
2024: 258
2025: 370
2026: 17
Papers
Towards Reliable Large Audio Language Model
ACL 2025
Cross-Aligned Fusion for Multimodal Understanding
WACV 2025
Do Mentioned Items Truly Matter? Enhancing Conversational Recommender Systems with Causal Intervention and Large Language Models
IJCAI 2025
Going Beyond Consistency: Target-oriented Multi-view Graph Neural Network
IJCAI 2025
Findings of the Shared Task on Misogyny Meme Detection: DravidianLangTech@NAACL 2025
NAACL 2025
CLaMP 3: Universal Music Information Retrieval Across Unaligned Modalities and Unseen Languages
ACL 2025
Dll5143@DravidianLangTech 2025: Majority Voting-Based Framework for Misogyny Meme Detection in Tamil and Malayalam
NAACL 2025
From Text to Multi-Modal: Advancing Low-Resource-Language Translation through Synthetic Data Generation and Cross-Modal Alignments
NAACL 2025
Exploring Multimodal Foundation AI and Expert-in-the-Loop for Sustainable Management of Wild Salmon Fisheries in Indigenous Rivers
IJCAI 2025
To Ask or Not to Ask? Detecting Absence of Information in Vision and Language Navigation
WACV 2025
DMPT: Decoupled Modality-Aware Prompt Tuning for Multi-Modal Object Re-Identification
WACV 2025
RGB-D Video Mirror Detection
WACV 2025
Mastering Collaborative Multi-modal Data Selection: A Focus on Informativeness, Uniqueness, and Representativeness
ICCV 2025
LONG3R: Long Sequence Streaming 3D Reconstruction
ICCV 2025
DualReal: Adaptive Joint Training for Lossless Identity-Motion Fusion in Video Customization
ICCV 2025
Code_Conquerors@DravidianLangTech 2025: Multimodal Misogyny Detection in Dravidian Languages Using Vision Transformer and BERT
NAACL 2025
AGRec: Adapting Autoregressive Decoders with Graph Reasoning for LLM-based Sequential Recommendation
ACL 2025
Fired_from_NLP@DravidianLangTech 2025: A Multimodal Approach for Detecting Misogynistic Content in Tamil and Malayalam Memes
NAACL 2025
CUET_NetworkSociety@DravidianLangTech 2025: A Multimodal Framework to Detect Misogyny Meme in Dravidian Languages
NAACL 2025
Towards Cross-Modality Modeling for Time Series Analytics: A Survey in the LLM Era
IJCAI 2025
STARS: A Unified Framework for Singing Transcription, Alignment, and Refined Style Annotation
ACL 2025
Incomplete Modality Disentangled Representation for Ophthalmic Disease Grading and Diagnosis
AAAI 2025
Beyond Label Semantics: Language-Guided Action Anatomy for Few-shot Action Recognition
ICCV 2025
Oasis: One Image is All You Need for Multimodal Instruction Data Synthesis
ICCV 2025
Lost in Variation? Evaluating NLI Performance in Basque and Spanish Geographical Variants
ACL 2025