AAAI 2026

MotivDance: Fine-Grained Text-Guided Motivation Choreography with Music Synchronization

Abstract

Realistic choreography demands simultaneous attention to rhythm and motivation. Prevailing automated dance generation methods depend mainly on musical input and overlook the motivations that drive meaningful dance creation. Inspired by motivation choreography, we aim to articulate dance motivations through textual guidance. However, the absence of high-quality datasets that jointly contain music, textual descriptions, and motion data makes accurate fine-grained textual control difficult. To address this limitation, we present MotivDance, a novel framework that integrates fine-grained textual guidance with music to synthesize semantically coherent dance sequences. Our approach first synthesizes text-guided key poses as motivations. We then introduce an Adaptive Keyframe Locator that dynamically positions these motivations within the musical context through beat-aware synchronization and cross-modal latent space alignment. Finally, a Transformer-based U-Net diffusion model performs motion in-betweening while preserving motivational integrity. Extensive qualitative and quantitative experiments demonstrate that MotivDance effectively integrates music with fine-grained text control to generate high-fidelity dance motions.
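As a rough illustration of the beat-aware positioning idea, the sketch below snaps candidate key-pose timestamps to the nearest musical beat. It is not the Adaptive Keyframe Locator itself, which also relies on cross-modal latent space alignment and whose details are not given here. The function name `snap_to_beats` and all timestamps are hypothetical and chosen only for illustration.

```python
from bisect import bisect_left


def snap_to_beats(candidate_times, beat_times):
    """Snap each candidate key-pose time (in seconds) to the nearest beat.

    Hypothetical helper for illustration only; a simple nearest-neighbour
    rule standing in for a learned keyframe locator.
    """
    beats = sorted(beat_times)
    if not beats:
        raise ValueError("beat_times must not be empty")

    snapped = []
    for t in candidate_times:
        # Index of the first beat that is >= t.
        i = bisect_left(beats, t)
        # The nearest beat is either the one just before or just after t.
        neighbours = beats[max(i - 1, 0): i + 1]
        snapped.append(min(neighbours, key=lambda b: abs(b - t)))
    return snapped


# Assumed example: beats every 0.5 s, three text-derived key-pose times.
beats = [0.0, 0.5, 1.0, 1.5]
key_pose_times = [0.62, 1.9, -0.1]
aligned = snap_to_beats(key_pose_times, beats)
```

In this toy setting each key pose lands on a beat, and a separate model (a diffusion model in the abstract's description) would then fill in the motion between the anchored poses.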
