Research Explorer
Multimodal Learning
24 directly classified papers
Papers per year
2019: 1
2020: 3
2022: 3
2024: 7
2025: 9
2026: 1
Papers
Model-free Domain Adaptation for Concealed Multimodal Large-Language Models
WACV 2026
ChartEdit: How Far Are MLLMs From Automating Chart Analysis? Evaluating MLLMs’ Capability via Chart Editing
ACL 2025
MLAN: Language-Based Instruction Tuning Preserves and Transfers Knowledge in Multimodal Language Models
ACL 2025
Benchmarking Table Extraction: Multimodal LLMs vs Traditional OCR
ACL 2025
On Domain-Adaptive Post-Training for Multimodal Large Language Models
EMNLP 2025
The Photographer's Eye: Teaching Multimodal Large Language Models to See, and Critique Like Photographers
CVPR 2025
GroundingFace: Fine-grained Face Understanding via Pixel Grounding Multimodal Large Language Model
CVPR 2025
UniEDU: Toward Unified and Efficient Large Multimodal Models for Educational Tasks
EMNLP 2025
mOSCAR: A Large-scale Multilingual and Multimodal Document-level Corpus
ACL 2025
PunchBench: Benchmarking MLLMs in Multimodal Punchline Comprehension
ACL 2025
ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models
NeurIPS 2024
Multimodal Instruction Tuning with Conditional Mixture of LoRA
ACL 2024
L+M-24: Building a Dataset for Language+Molecules @ ACL 2024
ACL 2024
ALMol: Aligned Language-Molecule Translation LLMs through Offline Preference Contrastive Optimisation
ACL 2024
PRESTO: Progressive Pretraining Enhances Synthetic Chemistry Outcomes
EMNLP 2024
MobileVLM: A Vision-Language Model for Better Intra- and Inter-UI Understanding
EMNLP 2024
Make Prompts Adaptable: Bayesian Modeling for Vision-Language Prompt Learning with Data-Dependent Prior
AAAI 2024
Mind Reader: Reconstructing complex images from brain activities
NeurIPS 2022
CapOnImage: Context-driven Dense-Captioning on Image
EMNLP 2022
ViLMedic: a framework for research at the intersection of vision and language in medical AI
ACL 2022
ViLBERTScore: Evaluating Image Caption Using Vision-and-Language BERT
EMNLP 2020
Video2Commonsense: Generating Commonsense Descriptions to Enrich Video Captioning
EMNLP 2020
Learning to Represent Image and Text with Denotation Graph
EMNLP 2020
Words Can Shift: Dynamically Adjusting Word Representations Using Nonverbal Behaviors
AAAI 2019