Research Explorer
Multi-Modal Learning
3194 directly classified papers
Papers per year
2003: 1
2010: 1
2011: 1
2013: 5
2014: 3
2015: 9
2016: 23
2017: 49
2018: 78
2019: 158
2020: 223
2021: 261
2022: 354
2023: 471
2024: 705
2025: 835
2026: 17
Papers
Multimodal Inverse Attention Network with Intrinsic Discriminant Feature Exploitation for Fake News Detection
IJCAI 2025
Narrating the Video: Boosting Text-Video Retrieval via Comprehensive Utilization of Frame-Level Captions
CVPR 2025
Unified Molecule-Text Language Model with Discrete Token Representation
IJCAI 2025
Stable Diffusion Models are Secretly Good at Visual In-Context Learning
ICCV 2025
Exploring Multimodal Foundation AI and Expert-in-the-Loop for Sustainable Management of Wild Salmon Fisheries in Indigenous Rivers
IJCAI 2025
Multi-View Incremental Learning with Structured Hebbian Plasticity for Enhanced Fusion Efficiency
AAAI 2025
CureGraph: Contrastive Multi-Modal Graph Representation Learning for Urban Living Circle Health Profiling and Prediction (Abstract Reprint)
IJCAI 2025
ObjVariantEnsemble: Advancing Point Cloud LLM Evaluation in Challenging Scenes with Subtly Distinguished Objects
AAAI 2025
Geminio: Language-Guided Gradient Inversion Attacks in Federated Learning
ICCV 2025
VQTalker: Towards Multilingual Talking Avatars Through Facial Motion Tokenization
AAAI 2025
Balancing Task-invariant Interaction and Task-specific Adaptation for Unified Image Fusion
ICCV 2025
Towards Open-Vocabulary Remote Sensing Image Semantic Segmentation
AAAI 2025
Few-Shot Audio-Visual Class-Incremental Learning with Temporal Prompting and Regularization
AAAI 2025
Uni-NaVid: A Video-based Vision-Language-Action Model for Unifying Embodied Navigation Tasks
RSS 2025
Fix-CLIP: Dual-Branch Hierarchical Contrastive Learning via Synthetic Captions for Better Understanding of Long Text
ICCV 2025
Scaling Omni-modal Pretraining with Multimodal Context: Advancing Universal Representation Learning Across Modalities
ICCV 2025
Benchmarking Multimodal Large Language Models Against Image Corruptions
ICCV 2025
AnimeGamer: Infinite Anime Life Simulation with Next Game State Prediction
ICCV 2025
AIM: Adaptive Inference of Multi-Modal LLMs via Token Merging and Pruning
ICCV 2025
FedMVP: Federated Multimodal Visual Prompt Tuning for Vision-Language Models
ICCV 2025
UIPro: Unleashing Superior Interaction Capability For GUI Agents
ICCV 2025
Utilizing Vision-Language Models for Detection of Leaf-Based Diseases in Tomatoes
AAAI 2025
Large Multi-modal Models Can Interpret Features in Large Multi-modal Models
ICCV 2025
Flow4Agent: Long-form Video Understanding via Motion Prior from Optical Flow
ICCV 2025
NeKo: Cross-Modality Post-Recognition Error Correction with Tasks-Guided Mixture-of-Experts Language Model
ACL 2025