HoloGAN: Unsupervised Learning of 3D Representations From Natural Images

Thu Nguyen-Phuoc; Chuan Li; Lucas Theis; Christian Richardt; Yong-Liang Yang

ICCV 2019

Abstract

We propose HoloGAN, a novel generative adversarial network (GAN) for the task of unsupervised learning of 3D representations from natural images. Most generative models rely on 2D kernels to generate images and make few assumptions about the 3D world. These models therefore tend to create blurry images or artefacts in tasks that require a strong 3D understanding, such as novel-view synthesis. HoloGAN instead learns a 3D representation of the world and learns to render this representation in a realistic manner. Unlike other GANs, HoloGAN provides explicit control over the pose of generated objects through rigid-body transformations of the learnt 3D features. Our experiments show that using explicit 3D features enables HoloGAN to disentangle 3D pose from identity, which is further decomposed into shape and appearance, while still generating images with similar or higher visual quality than other generative models. HoloGAN can be trained end-to-end from unlabelled 2D images only. In particular, we do not require pose labels, 3D shapes, or multiple views of the same objects. This shows that HoloGAN is the first generative model that learns 3D representations from natural images in an entirely unsupervised manner.
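To make the idea of a rigid-body transformation of 3D features concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a feature volume of shape (C, D, H, W) and rotates it about the vertical axis using inverse mapping with nearest-neighbour sampling. The function name `rotate_features_y`, the tensor layout, and the sampling scheme are all illustrative assumptions; a trainable model would need a differentiable resampling step instead.

```python
import numpy as np


def rotate_features_y(features, angle_rad):
    """Rotate a (C, D, H, W) feature volume about its vertical (H) axis.

    Hypothetical helper for illustration only. Uses inverse mapping with
    nearest-neighbour sampling; output voxels whose source location falls
    outside the volume are set to zero.
    """
    c, d, h, w = features.shape
    # Integer coordinates of every output voxel, each of shape (D, H, W).
    z, y, x = np.meshgrid(np.arange(d), np.arange(h), np.arange(w), indexing="ij")
    # Centre the depth and width coordinates on the middle of the volume.
    zc = z - (d - 1) / 2.0
    xc = x - (w - 1) / 2.0
    cos_a, sin_a = np.cos(angle_rad), np.sin(angle_rad)
    # Inverse rotation: locate the source voxel for each output voxel.
    src_x = cos_a * xc + sin_a * zc + (w - 1) / 2.0
    src_z = -sin_a * xc + cos_a * zc + (d - 1) / 2.0
    xi = np.rint(src_x).astype(int)
    zi = np.rint(src_z).astype(int)
    # Keep only output voxels whose source lies inside the volume.
    valid = (xi >= 0) & (xi < w) & (zi >= 0) & (zi < d)
    out = np.zeros_like(features)
    out[:, valid] = features[:, zi[valid], y[valid], xi[valid]]
    return out


if __name__ == "__main__":
    volume = np.random.rand(8, 16, 16, 16)  # 8 channels on a 16^3 grid
    rotated = rotate_features_y(volume, np.deg2rad(30.0))
    print(rotated.shape)
```

Because the pose is applied to the feature volume itself rather than to the output image, the same features can in principle be rendered from several viewpoints by changing only `angle_rad`.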

Topics

Machine Learning > Learning Types > Unsupervised Learning
Computer Vision > Analysis > 3D Vision
Computer Vision > Generation > Image Generation

Keywords

unsupervised learning; generative adversarial network; novel-view synthesis; 3D representation learning; rigid-body transformation
