ECCV 2024

Weakly-Supervised Spatio-Temporal Video Grounding with Variational Cross-Modal Alignment

Authors