ICML 2024

Video-LaVIT: Unified Video-Language Pre-training with Decoupled Visual-Motional Tokenization