Video Pixel Networks

Nal Kalchbrenner; Aäron Oord; Karen Simonyan; Ivo Danihelka; Oriol Vinyals; Alex Graves; Koray Kavukcuoglu

ICML 2017

Abstract

We propose a probabilistic video model, the Video Pixel Network (VPN), that estimates the discrete joint distribution of the raw pixel values in a video. The model and the neural architecture reflect the time, space and color structure of video tensors and encode it as a four-dimensional dependency chain. The VPN approaches the best possible performance on the Moving MNIST benchmark, a leap over the previous state of the art, and the generated videos show only minor deviations from the ground truth. The VPN also produces detailed samples on the action-conditional Robotic Pushing benchmark and generalizes to the motion of novel objects.
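
The "four-dimensional dependency chain" can be read as a chain-rule factorization of the joint distribution over every pixel value, indexed by time, row, column and color channel. The sketch below only illustrates that idea under stated assumptions: the raster ordering (frames, then rows, then columns, then channels) is one natural choice, and `toy_conditional` is a hypothetical stand-in for the neural network, not the VPN architecture itself.

```python
import math
from itertools import product


def dependency_order(T, H, W, C):
    """List (t, i, j, c) indices in the order the chain rule visits them:
    frames first, then rows, then columns, then color channels.
    Assumption: a plain raster ordering chosen for illustration."""
    return list(product(range(T), range(H), range(W), range(C)))


def toy_conditional(context, num_values):
    """Hypothetical stand-in for the network: a distribution over the next
    value given all previously visited values (Laplace-smoothed counts)."""
    counts = [1] * num_values
    for v in context:
        counts[v] += 1
    total = sum(counts)
    return [c / total for c in counts]


def log_likelihood(video, num_values=256):
    """Sum of conditional log-probabilities along the dependency chain.
    `video` is a nested list indexed [t][i][j][c] of ints in [0, num_values)."""
    T, H = len(video), len(video[0])
    W, C = len(video[0][0]), len(video[0][0][0])
    context = []
    total = 0.0
    for t, i, j, c in dependency_order(T, H, W, C):
        probs = toy_conditional(context, num_values)
        value = video[t][i][j][c]
        total += math.log(probs[value])
        context.append(value)  # later values may depend on this one
    return total
```

Because each factor is a normalized discrete distribution over the next value, the probabilities assigned to all possible videos of a given shape sum to one, whatever model supplies the conditionals.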

Topics

Machine Learning > Core Methods > Representation Learning

Keywords

video generation, neural architecture, video modeling, probabilistic video model, discrete joint distribution
