2024 CVPR CVPR 2024

FlowDiffuser: Advancing Optical Flow Estimation with Diffusion Models

Abstract

Optical flow estimation a process of predicting pixel-wise displacement between consecutive frames has commonly been approached as a regression task in the age of deep learning. Despite notable advancements this de facto paradigm unfortunately falls short in generalization performance when trained on synthetic or constrained data. Pioneering a paradigm shift we reformulate optical flow estimation as a conditional flow generation challenge unveiling FlowDiffuser --- a new family of optical flow models that could have stronger learning and generalization capabilities. FlowDiffuser estimates optical flow through a `noise-to-flow' strategy progressively eliminating noise from randomly generated flows conditioned on the provided pairs. To optimize accuracy and efficiency our FlowDiffuser incorporates a novel Conditional Recurrent Denoising Decoder (Conditional-RDD) streamlining the flow estimation process. It incorporates a unique Hidden State Denoising (HSD) paradigm effectively leveraging the information from previous time steps. Moreover FlowDiffuser can be easily integrated into existing flow networks leading to significant improvements in performance metrics compared to conventional implementations. Experiments on challenging benchmarks including Sintel and KITTI demonstrate the effectiveness of our FlowDiffuser with superior performance to existing state-of-the-art models. Code is available at https://github.com/LA30/FlowDiffuser.

🌉 Interdisciplinary Bridge — Computer Vision and Deep Learning and Machine Learning
🧭 Keyword Pioneer — pixel-wise displacement
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio