TeSTNeRF: Text-Driven 3D Style Transfer via Cross-Modal Learning

Jiafu Chen; Boyan Ji; Zhanjie Zhang; Tianyi Chu; Zhiwen Zuo; Lei Zhao; Wei Xing; Dongming Lu

IJCAI 2023

Abstract

Text-driven 3D style transfer aims to stylize a scene according to a text prompt and to render arbitrary novel views consistently. Simply combining image or video style transfer methods with novel view synthesis methods produces flickering when the viewpoint changes, while existing 3D style transfer methods learn styles from images rather than text. To address this problem, we design the first efficient text-driven model for 3D style transfer, named TeSTNeRF, which stylizes a scene from text via cross-modal learning: an advanced text encoder embeds the text to control 3D style transfer, and the input text and the output stylized images are aligned in latent space. Furthermore, to obtain better visual results, we introduce style supervision, which learns feature statistics from style images and uses 2D stylization results to rectify abrupt color spill. Extensive experiments demonstrate that TeSTNeRF significantly outperforms existing methods and provides a new way to guide 3D style transfer.
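The two supervision signals named in the abstract, text-image alignment in latent space and feature statistics learned from style images, can be illustrated with a small sketch. The abstract does not give the loss definitions, so everything below is an assumption: a cosine-distance alignment term and a per-channel mean/standard-deviation matching term are common choices in the style transfer literature, not necessarily the ones used by TeSTNeRF. The function names and the plain-list representation of embeddings and feature channels are hypothetical.

```python
import math


def cosine_alignment_loss(text_emb, image_emb):
    """Assumed alignment term: 1 - cosine similarity of two embeddings.

    Both arguments are equal-length lists of floats standing in for the
    outputs of a text encoder and an image encoder (hypothetical inputs).
    """
    dot = sum(t * i for t, i in zip(text_emb, image_emb))
    norm_t = math.sqrt(sum(t * t for t in text_emb))
    norm_i = math.sqrt(sum(i * i for i in image_emb))
    return 1.0 - dot / (norm_t * norm_i)


def channel_stats(channel):
    """Mean and (population) standard deviation of one feature channel."""
    mean = sum(channel) / len(channel)
    var = sum((x - mean) ** 2 for x in channel) / len(channel)
    return mean, math.sqrt(var)


def statistics_loss(stylized_feats, style_feats):
    """Assumed style term: squared gap between per-channel mean and std.

    Each argument is a list of channels; each channel is a flat list of
    activations. Channels are compared pairwise and the result is averaged
    over channels.
    """
    total = 0.0
    for stylized, style in zip(stylized_feats, style_feats):
        mean_a, std_a = channel_stats(stylized)
        mean_b, std_b = channel_stats(style)
        total += (mean_a - mean_b) ** 2 + (std_a - std_b) ** 2
    return total / len(style_feats)
```

Under these assumptions, identical embeddings give an alignment loss of 0 and orthogonal ones give 1, while the statistics term vanishes whenever the rendered view and the style image share the same per-channel mean and standard deviation.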

Topics

Artificial Intelligence > Core AI > Multimodal Learning
Computer Vision > Analysis > 3D Vision
Computer Vision > Generation > Image Generation
Computer Vision > Processing > Image Editing
Deep Learning > Learning Types > Multi-Modal Learning

Keywords

image generation, style transfer, cross-modal learning, neural radiance field, novel view synthesis, text encoder, latent space alignment, 3D style transfer, image stylization
