Compositional Video Synthesis with Action Graphs

Amir Bar; Roei Herzig; Xiaolong Wang; Anna Rohrbach; Gal Chechik; Trevor Darrell; Amir Globerson

2021 ICML ICML 2021

Compositional Video Synthesis with Action Graphs

Abstract

Videos of actions are complex signals containing rich compositional structure in space and time. Current video generation methods lack the ability to condition the generation on multiple coordinated and potentially simultaneous timed actions. To address this challenge, we propose to represent the actions in a graph structure called Action Graph and present the new "Action Graph To Video" synthesis task. Our generative model for this task (AG2Vid) disentangles motion and appearance features, and by incorporating a scheduling mechanism for actions facilitates a timely and coordinated video generation. We train and evaluate AG2Vid on CATER and Something-Something V2 datasets, which results in videos that have better visual quality and semantic consistency compared to baselines. Finally, our model demonstrates zero-shot abilities by synthesizing novel compositions of the learned actions.

🌉 Interdisciplinary Bridge — Computer Vision and Machine Learning

🐣 Hot Topic Early Bird — semantic consistency

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Amir Bar , Roei Herzig , Xiaolong Wang , Anna Rohrbach , Gal Chechik , Trevor Darrell , Amir Globerson

Topics

Machine Learning > Core Methods > Representation Learning Computer Vision > Generation > Video Generation

Keywords

representation learning action recognition video generation semantic consistency

Download PDF

Related papers

GRAND: Graph Neural Diffusion 2021

Almost Optimal Anytime Algorithm for Batched Multi-Armed Bandits 2021

Straight to the Gradient: Learning to Use Novel Tokens for Neural Text Generation 2021

Differentiable Dynamic Quantization with Mixed Precision and Adaptive Resolution 2021

Dataset Dynamics via Gradient Flows in Probability Space 2021