UFOGen: You Forward Once Large Scale Text-to-Image Generation via Diffusion GANs

Yanwu Xu; Yang Zhao; Zhisheng Xiao; Tingbo Hou

CVPR 2024

Abstract

Text-to-image diffusion models have demonstrated remarkable capabilities in transforming text prompts into coherent images, yet the computational cost of multi-step inference remains a persistent challenge. To address this issue, we present UFOGen, a novel generative model designed for ultra-fast, one-step text-to-image generation. In contrast to conventional approaches that focus on improving samplers or employing distillation techniques for diffusion models, UFOGen adopts a hybrid methodology, integrating diffusion models with a GAN objective. Leveraging a newly introduced diffusion-GAN objective and initialization with pre-trained diffusion models, UFOGen excels in efficiently generating high-quality images conditioned on textual descriptions in a single step. Beyond traditional text-to-image generation, UFOGen showcases versatility in applications. Notably, UFOGen stands among the pioneering models enabling one-step text-to-image generation and diverse downstream tasks, presenting a significant advancement in the landscape of efficient generative models.

Authors

Yanwu Xu, Yang Zhao, Zhisheng Xiao, Tingbo Hou

Topics

Deep Learning > Models > Diffusion Models
Deep Learning > Models > Generative Models
Computer Vision > Generation > Image Generation

Keywords

text-to-image generation, generative model, diffusion model, generative adversarial network, one-step generation, GAN objective

Related papers

DUSt3R: Geometric 3D Vision Made Easy 2024

Bezier Everywhere All at Once: Learning Drivable Lanes as Bezier Graphs 2024

NeRFDeformer: NeRF Transformation from a Single View via 3D Scene Flows 2024

Unleashing Unlabeled Data: A Paradigm for Cross-View Geo-Localization 2024

DIMAT: Decentralized Iterative Merging-And-Training for Deep Learning Models 2024