Adversarial Text to Continuous Image Generation

Kilichbek Haydarov; Aashiq Muhamed; Xiaoqian Shen; Jovana Lazarevic; Ivan Skorokhodov; Chamuditha Jayanga Galappaththige; Mohamed Elhoseiny

CVPR 2024

Abstract

Existing GAN-based text-to-image models treat images as 2D pixel arrays. In this paper, we approach the text-to-image task from a different perspective, in which a 2D image is represented as an implicit neural representation (INR). We show that straightforward conditioning of the unconditional INR-based GAN method on text inputs is not enough to achieve good performance. We propose a word-level attention-based weight modulation operator that controls the generation process of the hypernetwork-based INR-GAN. Our experiments on benchmark datasets show that the resulting model, HyperCGAN, achieves performance competitive with existing pixel-based methods and retains the properties of continuous generative models.
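
To make the two ideas in the abstract concrete, the sketch below shows (1) an image stored as a function from continuous coordinates to RGB values and (2) text-dependent scaling of that function's weights through attention over word embeddings. It is a toy illustration under loud assumptions, not the HyperCGAN architecture: the sine-activated two-layer network, the random stand-in word embeddings, the `projection` matrix, and the `1 + tanh` gate in `modulate` are all hypothetical choices made for this example.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, word_embeddings):
    """Dot-product attention of one query vector over per-word embeddings."""
    scores = [sum(q * w for q, w in zip(query, emb)) for emb in word_embeddings]
    weights = softmax(scores)
    dim = len(word_embeddings[0])
    context = [sum(a * emb[k] for a, emb in zip(weights, word_embeddings))
               for k in range(dim)]
    return weights, context


def modulate(weight_matrix, context, projection):
    """Scale each output row of a weight matrix by a text-dependent gate.

    Hypothetical operator for illustration only; the paper's operator may differ.
    """
    gates = [1.0 + math.tanh(sum(p * c for p, c in zip(row, context)))
             for row in projection]
    return [[g * w for w in row] for g, row in zip(gates, weight_matrix)]


def inr_pixel(x, y, w1, b1, w2, b2):
    """Evaluate the implicit representation at one continuous coordinate."""
    hidden = [math.sin(r[0] * x + r[1] * y + b) for r, b in zip(w1, b1)]
    logits = [sum(w * h for w, h in zip(r, hidden)) + b for r, b in zip(w2, b2)]
    return [1.0 / (1.0 + math.exp(-v)) for v in logits]  # RGB in (0, 1)


def render(params, height, width):
    """Sample the same function on a grid of any resolution over [0, 1]^2."""
    w1, b1, w2, b2 = params
    return [[inr_pixel(col / (width - 1), row / (height - 1), w1, b1, w2, b2)
             for col in range(width)]
            for row in range(height)]


rng = random.Random(0)
HIDDEN, EMB = 8, 4


def rand_matrix(rows, cols):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


w1, b1 = rand_matrix(HIDDEN, 2), [0.0] * HIDDEN
w2, b2 = rand_matrix(3, HIDDEN), [0.0] * 3
words = rand_matrix(5, EMB)          # stand-in for five word embeddings
query = [rng.uniform(-1.0, 1.0) for _ in range(EMB)]
projection = rand_matrix(HIDDEN, EMB)

attn_weights, context = attend(query, words)
w1_text = modulate(w1, context, projection)

low = render((w1_text, b1, w2, b2), 4, 4)
high = render((w1_text, b1, w2, b2), 8, 8)
```

Because the image is a function of coordinates rather than a fixed pixel array, `low` and `high` are two samplings of the same underlying image, which is one of the continuous-model properties the abstract refers to. In a hypernetwork-based generator the weights would be produced by a trained network rather than drawn at random as they are here.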

Topics

Artificial Intelligence > Core AI > Multimodal Learning
Deep Learning > Models > Generative Models

Keywords

text-to-image generation, implicit neural representation, weight modulation
