Locality-Aware Generalizable Implicit Neural Representation

Doyup Lee; Chiheon Kim; Minsu Cho; WOOK SHIN HAN

2023 NIPS NeurIPS 2023

Locality-Aware Generalizable Implicit Neural Representation

Abstract

Generalizable implicit neural representation (INR) enables a single continuous function, i.e., a coordinate-based neural network, to represent multiple data instances by modulating its weights or intermediate features using latent codes. However, the expressive power of the state-of-the-art modulation is limited due to its inability to localize and capture fine-grained details of data entities such as specific pixels and rays. To address this issue, we propose a novel framework for generalizable INR that combines a transformer encoder with a locality-aware INR decoder. The transformer encoder predicts a set of latent tokens from a data instance to encode local information into each latent token. The locality-aware INR decoder extracts a modulation vector by selectively aggregating the latent tokens via cross-attention for a coordinate input and then predicts the output by progressively decoding with coarse-to-fine modulation through multiple frequency bandwidths. The selective token aggregation and the multi-band feature modulation enable us to learn locality-aware representation in spatial and spectral aspects, respectively. Our framework significantly outperforms previous generalizable INRs and validates the usefulness of the locality-aware latents for downstream tasks such as image generation.

🌉 Interdisciplinary Bridge — Computer Vision and Deep Learning

🧭 Keyword Pioneer — locality-aware modulation

🐣 Hot Topic Early Bird — cross-attention mechanism

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Doyup Lee , Chiheon Kim , Minsu Cho , WOOK SHIN HAN

Topics

Deep Learning > Architectures > Transformers Deep Learning > Architectures > Neural Networks Computer Vision > Generation > Image Generation Deep Learning > Learning Types > Self-Supervised Learning

Keywords

image generation implicit neural representation cross-attention mechanism transformer encoder latent token coordinate-based network locality-aware modulation

Download PDF

Related papers

Risk-Averse Model Uncertainty for Distributionally Robust Safe Reinforcement Learning 2023

Generative Modeling through the Semi-dual Formulation of Unbalanced Optimal Transport 2023

Self-Supervised Motion Magnification by Backpropagating Through Optical Flow 2023

Diffused Task-Agnostic Milestone Planner 2023

Characterizing Graph Datasets for Node Classification: Homophily-Heterophily Dichotomy and Beyond 2023