ICML 2025

OTTER: A Vision-Language-Action Model with Text-Aware Visual Feature Extraction