FlowDiffuser: Advancing Optical Flow Estimation with Diffusion Models

Ao Luo; Xin Li; Fan Yang; Jiangyu Liu; Haoqiang Fan; Shuaicheng Liu

2024 CVPR CVPR 2024

FlowDiffuser: Advancing Optical Flow Estimation with Diffusion Models

Abstract

Optical flow estimation a process of predicting pixel-wise displacement between consecutive frames has commonly been approached as a regression task in the age of deep learning. Despite notable advancements this de facto paradigm unfortunately falls short in generalization performance when trained on synthetic or constrained data. Pioneering a paradigm shift we reformulate optical flow estimation as a conditional flow generation challenge unveiling FlowDiffuser --- a new family of optical flow models that could have stronger learning and generalization capabilities. FlowDiffuser estimates optical flow through a `noise-to-flow' strategy progressively eliminating noise from randomly generated flows conditioned on the provided pairs. To optimize accuracy and efficiency our FlowDiffuser incorporates a novel Conditional Recurrent Denoising Decoder (Conditional-RDD) streamlining the flow estimation process. It incorporates a unique Hidden State Denoising (HSD) paradigm effectively leveraging the information from previous time steps. Moreover FlowDiffuser can be easily integrated into existing flow networks leading to significant improvements in performance metrics compared to conventional implementations. Experiments on challenging benchmarks including Sintel and KITTI demonstrate the effectiveness of our FlowDiffuser with superior performance to existing state-of-the-art models. Code is available at https://github.com/LA30/FlowDiffuser.

🌉 Interdisciplinary Bridge — Computer Vision and Deep Learning and Machine Learning

🧭 Keyword Pioneer — pixel-wise displacement

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Ao Luo , Xin Li , Fan Yang , Jiangyu Liu , Haoqiang Fan , Shuaicheng Liu

Topics

Machine Learning > Core Methods > Regression Deep Learning > Models > Diffusion Models Computer Vision > Processing > Video Processing Computer Vision > Analysis > Motion Estimation Computer Vision > Analysis > Computer Vision

Keywords

motion estimation conditional generation optical flow estimation diffusion model video processing pixel-wise displacement denoising decoder conditional denoising pixel displacement

Download PDF

Related papers

DUSt3R: Geometric 3D Vision Made Easy 2024

Bezier Everywhere All at Once: Learning Drivable Lanes as Bezier Graphs 2024

NeRFDeformer: NeRF Transformation from a Single View via 3D Scene Flows 2024

Unleashing Unlabeled Data: A Paradigm for Cross-View Geo-Localization 2024

DIMAT: Decentralized Iterative Merging-And-Training for Deep Learning Models 2024