Spatially-Adaptive Pixelwise Networks for Fast Image Translation

Tamar Rott Shaham; Michael Gharbi; Richard Zhang; Eli Shechtman; Tomer Michaeli

CVPR 2021

Abstract

We introduce a new generator architecture, aimed at fast and efficient high-resolution image-to-image translation. We design the generator to be an extremely lightweight function of the full-resolution image. In fact, we use pixel-wise networks; that is, each pixel is processed independently of others, through a composition of simple affine transformations and nonlinearities. We take three important steps to equip such a seemingly simple function with adequate expressivity. First, the parameters of the pixel-wise networks are spatially varying so they can represent a broader function class than simple 1x1 convolutions. Second, these parameters are predicted by a fast convolutional network that processes an aggressively low-resolution representation of the input. Third, we augment the input image with a sinusoidal encoding of spatial coordinates, which provides an effective inductive bias for generating realistic novel high-frequency image content. As a result, our model is up to 18x faster than state-of-the-art baselines. We achieve this speedup while generating comparable visual quality across different image resolutions and translation domains.
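The three ingredients named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names, the frequency schedule, the leaky-ReLU nonlinearity, the layer widths, and the nearest-neighbor upsampling are all assumptions made for illustration. Random arrays stand in for the parameters that the low-resolution convolutional network would predict.

```python
import numpy as np

def sinusoidal_encoding(height, width, num_freqs):
    """Return an (H, W, 4 * num_freqs) array of sin/cos features of pixel coordinates."""
    ys, xs = np.meshgrid(np.arange(height), np.arange(width), indexing="ij")
    feats = []
    for k in range(num_freqs):
        freq = 2.0 * np.pi / (2.0 ** (k + 1))  # assumed frequency schedule
        for coord in (xs, ys):
            feats.append(np.sin(freq * coord))
            feats.append(np.cos(freq * coord))
    return np.stack(feats, axis=-1)

def upsample_nearest(params, factor):
    """Nearest-neighbor upsampling of a low-resolution parameter grid (axes 0 and 1)."""
    return np.repeat(np.repeat(params, factor, axis=0), factor, axis=1)

def pixelwise_mlp(features, weights, biases):
    """Apply an MLP independently at every pixel, with per-pixel parameters.

    features:   (H, W, C_0)
    weights[i]: (H, W, C_i, C_{i+1})
    biases[i]:  (H, W, C_{i+1})
    """
    h = features
    for i, (w, b) in enumerate(zip(weights, biases)):
        # Affine map at each pixel, using that pixel's own weight matrix and bias.
        h = np.einsum("hwc,hwcd->hwd", h, w) + b
        if i < len(weights) - 1:
            h = np.maximum(h, 0.2 * h)  # leaky ReLU (assumed nonlinearity)
    return h

# Toy usage: an 8x8 RGB image, parameters on a 2x2 grid (downsampling factor 4).
H, W, S = 8, 8, 4
rng = np.random.default_rng(0)
image = rng.random((H, W, 3))
x = np.concatenate([image, sinusoidal_encoding(H, W, num_freqs=2)], axis=-1)

dims = [x.shape[-1], 16, 3]  # hypothetical layer widths
# Random stand-ins for the output of the low-resolution predictor network.
weights = [upsample_nearest(rng.standard_normal((H // S, W // S, dims[i], dims[i + 1])), S)
           for i in range(len(dims) - 1)]
biases = [upsample_nearest(rng.standard_normal((H // S, W // S, dims[i + 1])), S)
          for i in range(len(dims) - 1)]

out = pixelwise_mlp(x, weights, biases)  # shape (8, 8, 3)
```

Because every layer acts on one pixel at a time, the full-resolution cost grows linearly with the number of pixels, and the output at a pixel depends on other pixels only through the predicted parameters. If the same weights and biases were used at every location, the computation would reduce to a stack of 1x1 convolutions.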

Topics

Deep Learning > Models > Generative Models
Computer Vision > Generation > Image Translation
Deep Learning > Optimization & Theory > Efficient Computing
Deep Learning > Architectures > Convolutional Neural Networks

Keywords

image generation, image translation, efficient computing, generative model, convolutional neural network, high-resolution image, spatial adaptation, sinusoidal encoding, generator architecture, pixel-wise network, spatially adaptive pixelwise network
