Research Explorer
Multi-Modal Learning
1213 directly classified papers
Papers per year
2007: 2
2008: 1
2009: 1
2011: 2
2012: 5
2013: 5
2014: 1
2015: 5
2016: 8
2017: 21
2018: 42
2019: 42
2020: 69
2021: 72
2022: 149
2023: 143
2024: 258
2025: 370
2026: 17
Papers
MAVFlow: Preserving Paralinguistic Elements with Conditional Flow Matching for Zero-Shot AV2AV Multilingual Translation
ICCV 2025
Robust Multi-View Learning via Representation Fusion of Sample-Level Attention and Alignment of Simulated Perturbation
ICCV 2025
Beyond Walking: A Large-Scale Image-Text Benchmark for Text-based Person Anomaly Search
ICCV 2025
DALIP: Distribution Alignment-based Language-Image Pre-Training for Domain-Specific Data
ICCV 2025
AffordDexGrasp: Open-set Language-guided Dexterous Grasp with Generalizable-Instructive Affordance
ICCV 2025
Beyond RGB: Adaptive Parallel Processing for RAW Object Detection
ICCV 2025
RGB-D Video Mirror Detection
WACV 2025
TrenTeam at Multilingual Counterspeech Generation: Multilingual Passage Re-Ranking Approaches for Knowledge-Driven Counterspeech Generation Against Hate
COLING 2025
FiVE-Bench: A Fine-grained Video Editing Benchmark for Evaluating Emerging Diffusion and Rectified Flow Models
ICCV 2025
DAMMFND: Domain-Aware Multimodal Multi-view Fake News Detection
AAAI 2025
IRT-Router: Effective and Interpretable Multi-LLM Routing via Item Response Theory
ACL 2025
ReFu: Recursive Fusion for Exemplar-Free 3D Class-Incremental Learning
WACV 2025
Cross-Aligned Fusion for Multimodal Understanding
WACV 2025
Unleashing Potentials of Vision-Language Models for Zero-Shot HOI Detection
WACV 2025
UnCo: Uncertainty-Driven Collaborative Framework of Large and Small Models for Grounded Multimodal NER
EMNLP 2025
Program Synthesis Benchmark for Visual Programming in XLogoOnline Environment
ACL 2025
Advancing Fine-Grained Visual Understanding with Multi-Scale Alignment in Multi-Modal Models
EMNLP 2025
OVQA: A Dataset for Visual Question Answering and Multimodal Research in Odia Language
COLING 2025
Who is in the Spotlight: The Hidden Bias Undermining Multimodal Retrieval-Augmented Generation
EMNLP 2025
Proxy-Driven Robust Multimodal Sentiment Analysis with Incomplete Data
ACL 2025
NCL-UoR at SemEval-2025 Task 3: Detecting Multilingual Hallucination and Related Observable Overgeneration Text Spans with Modified RefChecker and Modified SeflCheckGPT
SEMEVAL 2025
UoR-NCL at SemEval-2025 Task 1: Using Generative LLMs and CLIP Models for Multilingual Multimodal Idiomaticity Representation
SEMEVAL 2025
FiRC-NLP at SemEval-2025 Task 11: To Prompt or to Fine-Tune? Approaches for Multilingual Emotion Classification
SEMEVAL 2025
DynamicNER: A Dynamic, Multilingual, and Fine-Grained Dataset for LLM-based Named Entity Recognition
EMNLP 2025
ScaleMatch: Multi-scale Consistency Enhancement for Semi-supervised Semantic Segmentation
AAAI 2025