Papers
Harnessing Multimodal Large Language Models for Multimodal Sequential Recommendation
Yuyang Ye, Zhi Zheng, Yishan Shen et al.
SafeEraser: Enhancing Safety in Multimodal Large Language Models through Multimodal Machine Unlearning
Junkai Chen, Zhijie Deng, Kening Zheng et al.
Bridging Modalities: Improving Universal Multimodal Retrieval by Multimodal Large Language Models
Xin Zhang, Yanzhao Zhang, Wen Xie et al.
MM-CRITIC: A Holistic Evaluation of Large Multimodal Models as Multimodal Critique
Gailun Zeng, Ziyang Luo, Hongzhan Lin et al.
Benchmarking Multimodal Knowledge Conflict for Large Multimodal Models
Yifan Jia, Yuntao Du, Kailin Jiang et al.
MoME: Mixture of Multimodal Experts for Generalist Multimodal Large Language Models
Leyang Shen, Gongwei Chen, Rui Shao et al.
CofiPara: A Coarse-to-fine Paradigm for Multimodal Sarcasm Target Identification with Large Multimodal Models
Zixin Chen, Hongzhan Lin, Ziyang Luo et al.
MMRC: A Large-Scale Benchmark for Understanding Multimodal Large Language Model in Real-World Conversation
Haochen Xue, Feilong Tang, Ming Hu et al.
Using Game Play to Investigate Multimodal and Conversational Grounding in Large Multimodal Models
Sherzod Hakimov, Yerkezhan Abdullayeva, Kushal Koshti et al.
How Do Multimodal Large Language Models Handle Complex Multimodal Reasoning? Placing Them in An Extensible Escape Game
Ziyue Wang, Yurui Dong, Fuwen Luo et al.
SimpleVQA: Multimodal Factuality Evaluation for Multimodal Large Language Models
Xianfu Cheng, Wei Zhang, Shiwei Zhang et al.
Can Multimodal Large Language Models Truly Perform Multimodal In-Context Learning?
Shuo Chen, Zhen Han, Bailan He et al.
Multimodal Causal Reasoning Benchmark: Challenging Multimodal Large Language Models to Discern Causal Links Across Modalities
Zhiyuan Li, Heng Wang, Dongnan Liu et al.
Exploring and Evaluating Multimodal Knowledge Reasoning Consistency of Multimodal Large Language Models
Boyu Jia, Junzhe Zhang, Huixuan Zhang et al.
LMOD: A Large Multimodal Ophthalmology Dataset and Benchmark for Large Vision-Language Models
Zhenyue Qin, Yu Yin, Dylan Campbell et al.
Heuristic-Induced Multimodal Risk Distribution Jailbreak Attack for Multimodal Large Language Models
Teng Ma, Xiaojun Jia, Ranjie Duan et al.
MMUnlearner: Reformulating Multimodal Machine Unlearning in the Era of Multimodal Large Language Models
Jiahao Huo, Yibo Yan, Xu Zheng et al.
Enhancing Large Language Models for Scientific Multimodal Summarization with Multimodal Output
Zusheng Tan, Xinyi Zhong, Jing-Yu Ji et al.
Multimodal Needle in a Haystack: Benchmarking Long-Context Capability of Multimodal Large Language Models
Hengyi Wang, Haizhou Shi, Shiwei Tan et al.
MLLMGuard: A Multi-dimensional Safety Evaluation Suite for Multimodal Large Language Models
Tianle Gu, Zeyang Zhou, Kexin Huang et al.
Visual Anchors Are Strong Information Aggregators For Multimodal Large Language Model
Haogeng Liu, Quanzeng You, Xiaotian Han et al.
SciFIBench: Benchmarking Large Multimodal Models for Scientific Figure Interpretation
Jonathan Roberts, Kai Han, Neil Houlsby et al.
Grounding Multimodal Large Language Models in Actions
Andrew Szot, Bogdan Mazoure, Harsh Agrawal et al.
Adaptive Image Quality Assessment via Teaching Large Multimodal Model to Compare
Hanwei Zhu, Haoning Wu, Yixuan Li et al.