Papers
MM-SOC: Benchmarking Multimodal Large Language Models in Social Media Platforms
Yiqiao Jin, Minje Choi, Gaurav Verma et al.
An Empirical Study on Parameter-Efficient Fine-Tuning for MultiModal Large Language Models
Xiongtao Zhou, Jie He, Yuhua Ke et al.
MM-LLMs: Recent Advances in MultiModal Large Language Models
Duzhen Zhang, Yahan Yu, Jiahua Dong et al.
Aligning Large Multimodal Models with Factually Augmented RLHF
Zhiqing Sun, Sheng Shen, Shengcao Cao et al.
The Revolution of Multimodal Large Language Models: A Survey
Davide Caffagni, Federico Cocchi, Luca Barsellotti et al.
Dallah: A Dialect-Aware Multimodal Large Language Model for Arabic
Fakhraddin Alwajih, Gagan Bhatia, Muhammad Abdul-Mageed
Optimizing Multimodal Large Language Models for Detection of Alcohol Advertisements via Adaptive Prompting
Daniel Cabrera Lozoya, Jiahe Liu, Simon D’Alfonso et al.
MAIRA at RRG24: A specialised large multimodal model for radiology report generation
Shaury Srivastav, Mercy Ranjit, Fernando Pérez-García et al.
iHealth-Chile-1 at RRG24: In-context Learning and Finetuning of a Large Multimodal Model for Radiology Report Generation
Diego Campanini, Oscar Loch, Pablo Messina et al.
Can Multimodal Large Language Models Understand Spatial Relations?
Jingping Liu, Ziyan Liu, Zhedong Cen et al.
Con Instruction: Universal Jailbreaking of Multimodal Large Language Models via Non-Textual Modalities
Jiahui Geng, Thy Thy Tran, Preslav Nakov et al.
AdamMeme: Adaptively Probe the Reasoning Capacity of Multimodal Large Language Models on Harmfulness
Zixin Chen, Hongzhan Lin, Kaixin Li et al.
Modality-Aware Neuron Pruning for Unlearning in Multimodal Large Language Models
Zheyuan Liu, Guangyao Dou, Xiangchi Yuan et al.
Evaluating Multimodal Large Language Models on Video Captioning via Monte Carlo Tree Search
Linhao Yu, Xingguang Ji, Yahui Liu et al.
Unsolvable Problem Detection: Robust Understanding Evaluation for Large Multimodal Models
Atsuyuki Miyai, Jingkang Yang, Jingyang Zhang et al.
ChartCoder: Advancing Multimodal Large Language Model for Chart-to-Code Generation
Xuanle Zhao, Xianzhen Luo, Qi Shi et al.
ActiView: Evaluating Active Perception Ability for Multimodal Large Language Models
Ziyue Wang, Chi Chen, Fuwen Luo et al.
VQAGuider: Guiding Multimodal Large Language Models to Answer Complex Video Questions
Yuyan Chen, Jiyuan Jia, Jiaxin Lu et al.
MCS-Bench: A Comprehensive Benchmark for Evaluating Multimodal Large Language Models in Chinese Classical Studies
Yang Liu, Jiahuan Cao, Hiuyi Cheng et al.
GODBench: A Benchmark for Multimodal Large Language Models in Video Comment Art
Yiming Lei, Chenkai Zhang, Zeming Liu et al.
Single-to-mix Modality Alignment with Multimodal Large Language Model for Document Image Machine Translation
Yupu Liang, Yaping Zhang, Zhiyang Zhang et al.
HiDe-LLaVA: Hierarchical Decoupling for Continual Instruction Tuning of Multimodal Large Language Model
Haiyang Guo, Fanhu Zeng, Ziwei Xiang et al.
Investigating and Enhancing the Robustness of Large Multimodal Models Against Temporal Inconsistency
Jiafeng Liang, Shixin Jiang, Xuan Dong et al.
HiddenDetect: Detecting Jailbreak Attacks against Multimodal Large Language Models via Monitoring Hidden States
Yilei Jiang, Xinyan Gao, Tianshuo Peng et al.