Papers

10,699 papers found
Context-Informed Machine Translation of Manga using Multimodal Large Language Models
Philip Lippmann, Konrad Skublicki, Joshua Tanner et al.
2025 COLING
2024 CVPR
ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts
Mu Cai, Haotian Liu, Siva Karthik Mustikovela et al.
2024 CVPR
2024 CVPR
GLaMM: Pixel Grounding Large Multimodal Model
Hanoona Rasheed, Muhammad Maaz, Sahal Shaji et al.
2024 CVPR
On the Robustness of Large Multimodal Models Against Image Adversarial Attacks
Xuanming Cui, Alejandro Aparcedo, Young Kyun Jang et al.
2024 CVPR
2024 CVPR