Now You Shake Me: Towards Automatic 4D Cinema

Yuhao Zhou; Makarand Tapaswi; Sanja Fidler

CVPR 2018

Abstract

We are interested in enabling automatic 4D cinema by parsing physical and special effects from untrimmed movies. These include effects such as physical interactions, water splashing, light, and shaking, and are grounded to either a character in the scene or the camera. We collect a new dataset referred to as the Movie4D dataset which annotates over 9K effects in 63 movies. We propose a Conditional Random Field model atop a neural network that brings together visual and audio information, as well as semantics in the form of person tracks. Our model further exploits correlations of effects between different characters in the clip as well as across movie threads. We propose effect detection and classification as two tasks, and present results along with ablation studies on our dataset, paving the way towards 4D cinema in everyone’s homes.
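The idea of exploiting correlations of effects between characters can be illustrated with a toy pairwise CRF. The sketch below is only an illustration under stated assumptions, not the authors' model: the label set, the scores, and the function name `map_labels` are hypothetical, the unary scores stand in for per-character network outputs, and inference is done by brute-force enumeration, which is feasible only for a handful of characters. The abstract does not specify the actual potentials, learning procedure, or inference method.

```python
from itertools import product

# Hypothetical effect labels; the real label set is defined by the dataset.
LABELS = ["none", "shake", "splash"]


def map_labels(unary, pairwise):
    """Exhaustive MAP inference for a tiny fully connected pairwise CRF.

    unary[i][k]    : score for character i taking label k
                     (assumed here to come from a neural network)
    pairwise[a][b] : compatibility score for two characters in the same
                     clip taking labels a and b (assumed symmetric)
    Returns the highest-scoring joint labeling and its score.
    """
    n = len(unary)
    k = len(unary[0])
    best_score, best_labels = float("-inf"), None
    for labels in product(range(k), repeat=n):
        # Sum of per-character (unary) scores.
        score = sum(unary[i][labels[i]] for i in range(n))
        # Sum of pairwise compatibilities over all character pairs.
        score += sum(pairwise[labels[i]][labels[j]]
                     for i in range(n) for j in range(i + 1, n))
        if score > best_score:
            best_score, best_labels = score, labels
    return list(best_labels), best_score


# Made-up scores for two characters in one clip.
unary = [
    [0.2, 1.0, 0.1],  # character A: strong evidence for "shake"
    [0.6, 0.5, 0.1],  # character B: ambiguous, slightly favors "none"
]
# Made-up compatibilities that reward two characters sharing an effect.
pairwise = [
    [0.0, 0.0, 0.0],
    [0.0, 0.5, 0.0],
    [0.0, 0.0, 0.5],
]

labels, score = map_labels(unary, pairwise)
# Baseline: each character labeled independently from its unary scores.
independent = [max(range(len(LABELS)), key=lambda k: u[k]) for u in unary]

print("independent:", [LABELS[i] for i in independent])
print("joint MAP:  ", [LABELS[i] for i in labels], "score", score)
```

With these made-up numbers, labeling each character independently assigns "none" to character B, while the joint labeling assigns "shake" to both, because the pairwise term rewards agreement with character A's confident prediction.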

Authors

Yuhao Zhou, Makarand Tapaswi, Sanja Fidler

Topics

Artificial Intelligence > Core AI > Multimodal Learning

Machine Learning > Core Methods > Classification

Mathematics & Optimization > Optimization > Stochastic Methods

Machine Learning > Learning Types > Multi-Modal Learning

Computer Vision > Analysis > Video Understanding

Machine Learning > Core Methods > Structured Prediction

Keywords

temporal modeling, structured prediction, object detection, multimodal learning, audio-visual learning, video understanding, conditional random field, special effect, effect detection
