ICML 2024

STELLA: Continual Audio-Video Pre-training with SpatioTemporal Localized Alignment