Research Explorer
Multimodal Learning
323 directly classified papers
Papers per year
2014: 1
2015: 1
2017: 8
2018: 11
2019: 11
2020: 27
2021: 23
2022: 46
2023: 35
2024: 53
2025: 104
2026: 3
Papers
Model-free Domain Adaptation for Concealed Multimodal Large-Language Models
WACV 2026
LangPose: Language-Aligned Motion for Robust 3D Human Pose Estimation
WACV 2026
DomainCQA: Crafting Knowledge-Intensive QA from Domain-Specific Charts
AAAI 2026
Fine-Grained Image-Text Correspondence with Cost Aggregation for Open-Vocabulary Part Segmentation
CVPR 2025
Exploiting Vision Language Model for Training-Free 3D Point Cloud OOD Detection via Graph Score Propagation
ICCV 2025
Efficient Visual Place Recognition Through Multimodal Semantic Knowledge Integration
ICCV 2025
LegoSLM: Connecting LLM with Speech Encoder using CTC Posteriors
EMNLP 2025
ENCODER: Entity Mining and Modification Relation Binding for Composed Image Retrieval
AAAI 2025
UniEDU: Toward Unified and Efficient Large Multimodal Models for Educational Tasks
EMNLP 2025
EssayDetect at GenAI Detection Task 2: Guardians of Academic Integrity: Multilingual Detection of AI-Generated Essays
COLING 2025
Adversarial Alignment with Anchor Dragging Drift (A3D2): Multimodal Domain Adaptation with Partially Shifted Modalities
ACL 2025
Aligning Text/Speech Representations from Multimodal Models with MEG Brain Activity During Listening
EMNLP 2025
PresentAgent: Multimodal Agent for Presentation Video Generation
EMNLP 2025
External Memory Matters: Generalizable Object-Action Memory for Retrieval-Augmented Long-Term Video Understanding
IJCAI 2025
Let Modalities Teach Each Other: Modal-Collaborative Knowledge Extraction and Fusion for Multimodal Knowledge Graph Completion
NAACL 2025
From Introspection to Best Practices: Principled Analysis of Demonstrations in Multimodal In-Context Learning
NAACL 2025
TEAM_STRIKERS@DravidianLangTech2025: Misogyny Meme Detection in Tamil Using Multimodal Deep Learning
NAACL 2025
The_Deathly_Hallows@DravidianLangTech 2025: Multimodal Hate Speech Detection in Dravidian Languages
NAACL 2025
Hybrid-Tower: Fine-grained Pseudo-query Interaction and Generation for Text-to-Video Retrieval
ICCV 2025
YNU-HPCC at SemEval-2025 Task 1: Enhancing Multimodal Idiomaticity Representation via LoRA and Hybrid Loss Optimization
SEMEVAL 2025
Cross-Aligned Fusion for Multimodal Understanding
WACV 2025
Unified Multimodal Understanding via Byte-Pair Visual Encoding
ICCV 2025
Can VLMs Actually See and Read? A Survey on Modality Collapse in Vision-Language Models
ACL 2025
Multi-Modal Synergistic Implicit Image Enhancement for Efficient Optical Flow Estimation
CVPR 2025
ClimbingCap: Multi-Modal Dataset and Method for Rock Climbing in World Coordinate
CVPR 2025