2025 NAACL NAACL 2025

VoCoT: Unleashing Visually Grounded Multi-Step Reasoning in Large Multi-Modal Models