CVPR 2024

Lodge: A Coarse to Fine Diffusion Network for Long Dance Generation Guided by the Characteristic Dance Primitives

Abstract

We propose Lodge, a network capable of generating extremely long dance sequences conditioned on given music. We design Lodge as a two-stage coarse-to-fine diffusion architecture and propose characteristic dance primitives, which possess significant expressiveness, as the intermediate representation between the two diffusion models. The first stage is a global diffusion that focuses on comprehending the coarse-level music-dance correlation and producing characteristic dance primitives. In contrast, the second stage is a local diffusion that generates detailed motion sequences in parallel under the guidance of the dance primitives and choreographic rules. In addition, we propose a Foot Refine Block to optimize the contact between the feet and the ground, enhancing the physical realism of the motion. Code is available at https://li-ronghui.github.io/lodge
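The two-stage data flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`global_stage`, `local_stage`, `generate_long_dance`), the `segment_len` parameter, and the use of scalar per-frame features are all hypothetical, and both diffusion models are replaced by trivial placeholders (boundary sampling and linear interpolation). The sketch only shows how sparse primitives from a global stage could let each segment be generated independently and in parallel; the Foot Refine Block is omitted.

```python
from concurrent.futures import ThreadPoolExecutor


def global_stage(music, segment_len):
    """Placeholder for the global diffusion (hypothetical).

    Returns one coarse "primitive" per segment boundary, keyed by frame index.
    Here a primitive is simply the music feature at that frame.
    """
    boundaries = set(range(0, len(music), segment_len)) | {len(music) - 1}
    return {t: music[t] for t in sorted(boundaries)}


def local_stage(start, end, primitives):
    """Placeholder for the local diffusion (hypothetical).

    Fills in frames start..end (inclusive), conditioned on the primitives at
    both ends. Linear interpolation stands in for the learned denoiser.
    """
    p0, p1 = primitives[start], primitives[end]
    span = end - start
    return [p0 + (p1 - p0) * (k / span) for k in range(span + 1)]


def generate_long_dance(music, segment_len=4):
    """Coarse-to-fine sketch: primitives first, then parallel segment filling."""
    if not music:
        raise ValueError("music must contain at least one frame")
    primitives = global_stage(music, segment_len)
    keys = sorted(primitives)
    pairs = list(zip(keys, keys[1:]))
    if not pairs:  # a single frame: nothing to fill in
        return [primitives[keys[0]]]
    # Segments depend only on their two bounding primitives, so they can be
    # generated independently of each other.
    with ThreadPoolExecutor() as pool:
        segments = list(pool.map(lambda p: local_stage(p[0], p[1], primitives), pairs))
    # Adjacent segments share a boundary frame; keep it only once.
    motion = list(segments[0])
    for seg in segments[1:]:
        motion.extend(seg[1:])
    return motion
```

Because each segment is conditioned only on the primitives at its two ends, the sequence length can grow without the local stage ever processing the full sequence at once, which is the property the coarse-to-fine design relies on.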
