Text-Adaptive Generative Adversarial Networks: Manipulating Images with Natural Language

Seonghyeon Nam; Yunji Kim; Seon Joo Kim

NeurIPS 2018

Abstract

This paper addresses the problem of manipulating images using natural language descriptions. Our task aims to semantically modify the visual attributes of an object in an image according to text describing the new visual appearance. Although existing methods synthesize images with new attributes, they do not fully preserve the text-irrelevant contents of the original image. In this paper, we propose the text-adaptive generative adversarial network (TAGAN) to generate semantically manipulated images while preserving text-irrelevant contents. The key to our method is the text-adaptive discriminator, which creates word-level local discriminators according to the input text to classify fine-grained attributes independently. With this discriminator, the generator learns to generate images in which only the regions that correspond to the given text are modified. Experimental results show that our method outperforms existing methods on the CUB and Oxford-102 datasets, and our results were mostly preferred in a user study. Extensive analysis shows that our method is able to effectively disentangle visual attributes and produce pleasing outputs.
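The idea of word-level local discriminators can be illustrated with a small sketch. This is not the authors' implementation: it assumes that each word embedding generates the weights and bias of its own linear classifier over a pooled image feature, and that the per-word scores are combined by an attention-weighted geometric mean. The function `text_adaptive_score` and the parameters `W`, `b_proj`, and `attn_vec` are hypothetical names introduced only for illustration.

```python
import math


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def text_adaptive_score(image_feat, word_vecs, W, b_proj, attn_vec):
    """Combine one local discriminator per word into a single score.

    image_feat: pooled image feature, length d_img (assumed given)
    word_vecs:  list of word embeddings, each of length d_txt
    W:          d_img x d_txt matrix mapping a word to classifier weights
    b_proj:     length d_txt vector mapping a word to a classifier bias
    attn_vec:   length d_txt vector scoring each word's importance
    """
    # Attention over words: how much each word contributes to the decision.
    alphas = softmax([dot(attn_vec, w) for w in word_vecs])

    local_scores = []
    log_score = 0.0
    for w, alpha in zip(word_vecs, alphas):
        # The word generates the weights and bias of its own classifier.
        weights = [dot(row, w) for row in W]
        bias = dot(b_proj, w)
        p = sigmoid(dot(weights, image_feat) + bias)
        local_scores.append(p)
        # Weighted geometric mean, accumulated in log space;
        # the floor guards against log(0) on extreme inputs.
        log_score += alpha * math.log(max(p, 1e-12))
    return math.exp(log_score), local_scores, alphas
```

Because each word gets its own classifier in this sketch, a low score for one word (say, a color) pulls the combined score down without involving the classifiers of the other words. That is one way to read the abstract's claim that fine-grained attributes are classified independently.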

Topics

Machine Learning > Learning Types > Adversarial Learning
Deep Learning > Models > Generative Models
Computer Vision > Generation > Image Generation
Deep Learning > Learning Types > Adversarial Learning
Computer Vision > Generation > Image Editing
Deep Learning > Learning Types > Generative Models

Keywords

image generation, natural language processing, feature disentanglement, natural language, generative adversarial network, image manipulation, semantic modification
