ReplaceAnything3D: Text-Guided Object Replacement in 3D Scenes with Compositional Scene Representations

Edward Bartrum; Thu Nguyen-Phuoc; Chris Xie; Zhengqin Li; Numair Khan; Armen Avetisyan; Douglas Lanman; Lei Xiao

NeurIPS 2024

Abstract

We introduce the ReplaceAnything3D model (RAM3D), a novel method for replacing objects in 3D scenes based on a user's text description. Given multi-view images of a scene, a text prompt describing the object to replace, and another describing the new object, our Erase-and-Replace approach can effectively swap objects in 3D scenes with newly generated content while maintaining 3D consistency across multiple viewpoints. We demonstrate the versatility of RAM3D by applying it to various realistic 3D scene types, showing results in which the modified objects blend seamlessly into the scene without affecting its overall integrity.

Topics

Artificial Intelligence > Core AI > Multimodal Learning
Artificial Intelligence > Core AI > Procedural Generation
Computer Vision > Processing > Image Processing
Deep Learning > Learning Types > Multi-Modal Learning
Computer Vision > Generation > 3D Generation

Keywords

scene representation, multi-view image, object swapping, text-guided generation, 3D scene, 3D consistency, 3D object replacement, object replacement
