Tactile-Augmented Radiance Fields

Yiming Dou; Fengyu Yang; Yi Liu; Antonio Loquercio; Andrew Owens

CVPR 2024

Abstract

We present a scene representation, which we call a tactile-augmented radiance field, that brings vision and touch into a shared 3D space. This representation capitalizes on two key insights: (i) ubiquitous vision-based touch sensors are built on perspective cameras, and (ii) visually and structurally similar regions of a scene share the same tactile features. We use these insights to train a conditional diffusion model that, provided with an RGB image and a depth map rendered from a neural radiance field, generates its corresponding tactile "image". To train this diffusion model, we collect the largest collection of spatially-aligned visual and tactile data. Through qualitative and quantitative experiments, we demonstrate the accuracy of our cross-modal generative model and the utility of collected and rendered visual-tactile pairs across a range of downstream tasks. Project page: https://dou-yiming.github.io/TaRF
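To make the conditioning idea concrete, the following is a minimal sketch of one training step for a generic DDPM-style conditional diffusion model in NumPy. It is not the authors' implementation: the abstract does not specify the architecture, noise schedule, or conditioning mechanism, so the linear schedule, the channel-wise concatenation of RGB and depth, the image size, and every function name here (`make_alpha_bar`, `add_noise`, `build_denoiser_input`, `denoising_loss`) are illustrative assumptions. The zero-valued noise prediction stands in for a trained network.

```python
import numpy as np

def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for an assumed linear noise schedule."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def add_noise(x0, t, alpha_bar, rng):
    """Sample x_t from q(x_t | x_0); also return the Gaussian noise that was added."""
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return x_t, eps

def build_denoiser_input(x_t, rgb, depth):
    """Stack the noisy tactile image with RGB and depth along the channel axis.

    Channel concatenation is an assumed conditioning mechanism, chosen for simplicity.
    """
    return np.concatenate([x_t, rgb, depth[..., None]], axis=-1)

def denoising_loss(eps_pred, eps):
    """Mean squared error between predicted and true noise."""
    return float(np.mean((eps_pred - eps) ** 2))

# Synthetic stand-ins for one aligned sample (hypothetical sizes and value ranges).
rng = np.random.default_rng(0)
H, W = 32, 32
tactile = rng.uniform(-1.0, 1.0, (H, W, 3))  # target tactile "image"
rgb = rng.uniform(-1.0, 1.0, (H, W, 3))      # RGB view rendered from the radiance field
depth = rng.uniform(0.0, 1.0, (H, W))        # depth map rendered from the same viewpoint

alpha_bar = make_alpha_bar()
t = 500
x_t, eps = add_noise(tactile, t, alpha_bar, rng)
net_in = build_denoiser_input(x_t, rgb, depth)  # shape (H, W, 7)

eps_pred = np.zeros_like(eps)  # placeholder: a real model would predict eps from net_in and t
loss = denoising_loss(eps_pred, eps)
```

In a real system, `eps_pred` would come from a learned denoiser (for example a U-Net) that takes `net_in` and the timestep `t`, and the loss would be minimized over many aligned visual-tactile pairs.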

Authors

Yiming Dou, Fengyu Yang, Yi Liu, Antonio Loquercio, Andrew Owens

Topics

Machine Learning > Core Methods > Representation Learning
Deep Learning > Models > Diffusion Models
Computer Vision > Generation > Image Generation
Computer Vision > Core AI > Multimodal Learning
Deep Learning > Learning Types > Multi-Modal Learning
Computer Vision > Processing > 3D Vision

Keywords

multimodal learning, diffusion model, neural radiance field, 3D scene representation, tactile sensing, depth map, cross-modal generation, conditional diffusion model, tactile perception, vision-touch fusion
