How to Make a Pizza: Learning a Compositional Layer-Based GAN Model

Dim P. Papadopoulos; Youssef Tamaazousti; Ferda Ofli; Ingmar Weber; Antonio Torralba

CVPR 2019

Abstract

A food recipe is an ordered set of instructions for preparing a particular dish. From a visual perspective, every instruction step can be seen as a way to change the visual appearance of the dish by adding extra objects (e.g., adding an ingredient) or changing the appearance of the existing ones (e.g., cooking the dish). In this paper, we aim to teach a machine how to make a pizza by building a generative model that mirrors this step-by-step procedure. To do so, we learn composable module operations that can either add or remove a particular ingredient. Each operator is designed as a Generative Adversarial Network (GAN). Given only weak image-level supervision, the operators are trained to generate a visual layer that needs to be added to or removed from the existing image. The proposed model is able to decompose an image into an ordered sequence of layers by sequentially applying the corresponding removal modules in the right order. Experimental results on synthetic and real pizza images demonstrate that our proposed model is able to: (1) segment pizza toppings in a weakly supervised fashion, (2) remove them by revealing what is occluded underneath them (i.e., inpainting), and (3) infer the ordering of the toppings without any depth ordering supervision. Code, data, and models are available online.
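The layer-based composition described in the abstract can be illustrated with a small sketch. It assumes that each layer is a pair of an appearance image and a soft mask, blended onto the current image by standard alpha compositing. The abstract does not state this blending rule, so it is an assumption, and the names `composite` and `apply_layers` are hypothetical. The layers here are hand-written constants, not outputs of a trained GAN module.

```python
from typing import List, Sequence, Tuple

# Grayscale image as rows of intensities in [0, 1] (illustrative representation).
Image = List[List[float]]


def composite(image: Image, appearance: Image, mask: Image) -> Image:
    """Blend one layer onto an image: out = mask * appearance + (1 - mask) * image.

    Assumption: standard alpha compositing; the abstract only says a layer is
    "added to" the existing image.
    """
    return [
        [m * a + (1.0 - m) * p for p, a, m in zip(img_row, app_row, mask_row)]
        for img_row, app_row, mask_row in zip(image, appearance, mask)
    ]


def apply_layers(base: Image, layers: Sequence[Tuple[Image, Image]]) -> Image:
    """Apply (appearance, mask) layers in order; later layers occlude earlier ones."""
    out = base
    for appearance, mask in layers:
        out = composite(out, appearance, mask)
    return out


# A 1x2 "pizza": the first topping covers only the left pixel,
# the second topping covers both pixels.
base = [[0.2, 0.2]]
topping_a = ([[0.9, 0.9]], [[1.0, 0.0]])
topping_b = ([[0.1, 0.1]], [[1.0, 1.0]])

a_then_b = apply_layers(base, [topping_a, topping_b])
b_then_a = apply_layers(base, [topping_b, topping_a])
print(a_then_b, b_then_a)
```

The two orderings give different images because the layer applied last hides whatever lies beneath it. This is why recovering the order of the toppings is part of decomposing an image into layers.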

Topics

Deep Learning > Models > Generative Models
Computer Vision > Generation > Image Generation
Deep Learning > Learning Types > Weakly Supervised Learning
Deep Learning > Learning Types > Generative Models

Keywords

semantic segmentation, weakly supervised learning, image synthesis, image translation, generative adversarial network, compositional model, layer decomposition, ingredient segmentation, layer-based generation
