CLIPDraw: Exploring Text-to-Drawing Synthesis through Language-Image Encoders

Kevin Frans; Lisa Soros; Olaf Witkowski

NeurIPS 2022

Abstract

CLIPDraw is an algorithm that synthesizes novel drawings from natural language input. It does not require any additional training; rather, a pre-trained CLIP language-image encoder is used as a metric for maximizing similarity between the given description and a generated drawing. Crucially, CLIPDraw operates over vector strokes rather than pixel images, which biases drawings towards simpler human-recognizable shapes. Results compare CLIPDraw with other synthesis-through-optimization methods, as well as highlight various interesting behaviors of CLIPDraw.
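The synthesis-through-optimization loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation, and every component below is a stand-in chosen so the example runs with the Python standard library only. A fixed random linear projection plays the role of the pre-trained CLIP image encoder. The embedding of a reference drawing plays the role of the CLIP text embedding of the prompt. A soft line rasterizer with finite-difference gradients plays the role of a differentiable vector renderer. All names and constants (`render`, `make_encoder`, `optimize`, `GRID`, `EMBED`, `SIGMA`) are hypothetical. What carries over is the structure: stroke parameters, not pixels, are adjusted to increase the cosine similarity between the drawing's embedding and a target embedding.

```python
import math
import random

GRID = 8      # toy canvas resolution (assumption)
EMBED = 8     # toy embedding size (assumption)
SIGMA = 0.08  # stroke softness of the toy rasterizer (assumption)


def render(strokes):
    """Softly rasterize line segments (x0, y0, x1, y1) onto a GRID x GRID canvas."""
    image = []
    for row in range(GRID):
        for col in range(GRID):
            px, py = (col + 0.5) / GRID, (row + 0.5) / GRID
            ink = 0.0
            for i in range(0, len(strokes), 4):
                x0, y0, x1, y1 = strokes[i:i + 4]
                dx, dy = x1 - x0, y1 - y0
                length_sq = dx * dx + dy * dy + 1e-12
                # Closest point on the segment to the pixel centre.
                t = max(0.0, min(1.0, ((px - x0) * dx + (py - y0) * dy) / length_sq))
                dist_sq = (px - x0 - t * dx) ** 2 + (py - y0 - t * dy) ** 2
                ink = max(ink, math.exp(-dist_sq / (2 * SIGMA ** 2)))
            image.append(ink)
    return image


def make_encoder(seed=0):
    """Fixed random linear projection: a stand-in for a frozen image encoder."""
    rng = random.Random(seed)
    weights = [[rng.gauss(0, 1) for _ in range(GRID * GRID)] for _ in range(EMBED)]

    def encode(image):
        return [sum(w * p for w, p in zip(row, image)) for row in weights]

    return encode


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / (norm + 1e-12)


def optimize(target_embedding, encode, n_strokes=3, steps=30, seed=1):
    """Gradient ascent on stroke parameters; the encoder itself is never trained."""
    rng = random.Random(seed)
    params = [rng.random() for _ in range(4 * n_strokes)]

    def score(p):
        return cosine(encode(render(p)), target_embedding)

    initial = best = score(params)
    step, eps = 0.5, 1e-4
    for _ in range(steps):
        # Central finite differences replace backpropagation through a renderer.
        grad = []
        for i in range(len(params)):
            hi = params[:i] + [params[i] + eps] + params[i + 1:]
            lo = params[:i] + [params[i] - eps] + params[i + 1:]
            grad.append((score(hi) - score(lo)) / (2 * eps))
        candidate = [p + step * g for p, g in zip(params, grad)]
        candidate_score = score(candidate)
        if candidate_score > best:
            params, best = candidate, candidate_score  # accept the improving step
        else:
            step *= 0.5                                # otherwise shrink the step
    return params, initial, best


encode = make_encoder()
# Stand-in for the text embedding: the embedding of a diagonal reference stroke.
target = encode(render([0.1, 0.1, 0.9, 0.9]))
strokes, initial, final = optimize(target, encode)
print(f"similarity: {initial:.3f} -> {final:.3f}")
```

Because a candidate step is accepted only when it raises the similarity, the final score can never fall below the initial one. In a real setup the stand-ins would presumably be replaced by actual CLIP encoders and a differentiable vector renderer, with gradients obtained by backpropagation rather than finite differences.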

Topics

Machine Learning > Optimization & Theory > Optimization
Machine Learning > Application Areas > Efficient Computing
Computer Vision > Generation > Image Generation
Natural Language Processing > Generation > Text Generation
Mathematics & Optimization > Optimization > Continuous Optimization
Computer Science > Systems > Computer Graphics
Computer Science > Applications > Information Retrieval
Deep Learning > Learning Types > Generative Models

Keywords

image generation, CLIP encoder, text-to-drawing synthesis, vector stroke, language-image encoder
