Research Explorer
Multimodal Learning
85 directly classified papers
Papers per year
2017: 2
2019: 3
2020: 4
2021: 4
2022: 8
2023: 12
2024: 15
2025: 37
Papers
HumorDB: Can AI understand graphical humor?
ICCV 2025
IteRPrimE: Zero-shot Referring Image Segmentation with Iterative Grad-CAM Refinement and Primary Word Emphasis
AAAI 2025
Multimodal LLMs as Customized Reward Models for Text-to-Image Generation
ICCV 2025
Does Your Vision-Language Model Get Lost in the Long Video Sampling Dilemma?
ICCV 2025
MHBench: Demystifying Motion Hallucination in VideoLLMs
AAAI 2025
MMDocIR: Benchmarking Multimodal Retrieval for Long Documents
EMNLP 2025
Exploring Artificial Image Generation for Stance Detection
EMNLP 2025
STiL: Semi-supervised Tabular-Image Learning for Comprehensive Task-Relevant Information Exploration in Multimodal Classification
CVPR 2025
Zero-shot Multimodal Document Retrieval via Cross-modal Question Generation
EMNLP 2025
Streaming VideoLLMs for Real-Time Procedural Video Understanding
ICCV 2025
Multimodal Prior Learning with Double Constraint Alignment for Snapshot Spectral Compressive Imaging
IJCAI 2025
Unmasking Deceptive Visuals: Benchmarking Multimodal Large Language Models on Misleading Chart Question Answering
EMNLP 2025
Make VLM Recognize Visual Hallucination on Cartoon Character Image with Pose Information
WACV 2025
Unsupervised Video Highlight Detection by Learning from Audio and Visual Recurrence
WACV 2025
Language Model Based Text-to-Audio Generation: Anti-Causally Aligned Collaborative Residual Transformers
EMNLP 2025
Forecasting Credit Ratings: A Case Study where Traditional Methods Outperform Generative LLMs
COLING 2025
Thought2Text: Text Generation from EEG Signal using Large Language Models (LLMs)
NAACL 2025
The American Sign Language Knowledge Graph: Infusing ASL Models with Linguistic Knowledge
NAACL 2025
SSNCSE@DravidianLangTech 2025: Multimodal Hate Speech Detection in Dravidian Languages
NAACL 2025
Overview of the Shared Task on Multimodal Hate Speech Detection in Dravidian languages: DravidianLangTech@NAACL 2025
NAACL 2025
HerWILL@DravidianLangTech 2025: Ensemble Approach for Misogyny Detection in Memes Using Pre-trained Text and Vision Transformers
NAACL 2025
Podcast Outcasts: Understanding Rumble’s Podcast Dynamics
NAACL 2025
Sentiment Analysis on Video Transcripts: Comparing the Value of Textual and Multimodal Annotations
NAACL 2025
UnifiedVisual: A Framework for Constructing Unified Vision-Language Datasets
EMNLP 2025
Debiased Multimodal Understanding for Human Language Sequences
AAAI 2025