← Back to papers

2025 NAACL NAACL 2025

VoCoT: Unleashing Visually Grounded Multi-Step Reasoning in Large Multi-Modal Models

Authors

Zejun Li , Ruipu Luo , Jiwen Zhang , Minghui Qiu , Xuanjing Huang , zhongyu wei

Related papers

byteSizedLLM@DravidianLangTech 2025: Multimodal Misogyny Meme Detection in Low-Resource Dravidian Languages Using Transliteration-Aware XLM-RoBERTa, ResNet-50, and Attention-BiLSTM 2025

Few-shot Personalization of LLMs with Mis-aligned Responses 2025

NLI under the Microscope: What Atomic Hypothesis Decomposition Reveals 2025

Understanding Figurative Meaning through Explainable Visual Entailment 2025

CogLM: Tracking Cognitive Development of Large Language Models 2025