Geometry-Aware Deep Learning for 3D Skeleton-Based Motion Prediction
Abstract
Human motion prediction remains a challenging problem in computer vision, particularly for 3D skeleton-based motion. Deep learning models, although successful in most vision tasks, were designed for data with an underlying Euclidean structure. This assumption does not always hold, because pre-processed skeleton data often reside in a non-linear space. Conventional RNNs also struggle to capture long-term dependencies in motion sequences. We propose a geometry-aware deep learning approach to motion prediction. It uses a compact manifold-valued representation of 3D human skeleton motion, combined with self-attention in transformer networks. This representation maps motions to points on a manifold, so that long-term predictions remain smooth and coherent. Combining Kendall's shape space for non-rigid deformations with a Lie group for rigid deformations gives a complete description of the transformation between poses. Experiments on several datasets show that the approach outperforms state-of-the-art methods at both short-term and long-term horizons.
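To make the split between rigid and non-rigid deformation concrete, the following is a minimal NumPy sketch and not the implementation used in this work. It assumes a pose is an array of shape (n_joints, 3), takes the rotation group SO(3) as the Lie group, and uses the standard Kabsch (orthogonal Procrustes) alignment. The function names `to_preshape`, `optimal_rotation`, and `shape_distance` are hypothetical.

```python
import numpy as np


def to_preshape(joints):
    """Remove translation and scale from an (n_joints, 3) skeleton.

    The result has zero mean and unit Frobenius norm, i.e. it lies on
    Kendall's pre-shape sphere.
    """
    centered = joints - joints.mean(axis=0, keepdims=True)
    return centered / np.linalg.norm(centered)


def optimal_rotation(src, dst):
    """Rotation R in SO(3) minimizing ||src @ R.T - dst||_F (Kabsch)."""
    u, _, vt = np.linalg.svd(dst.T @ src)
    # Flip the last singular direction if needed so that det(R) = +1.
    d = np.sign(np.linalg.det(u @ vt))
    return u @ np.diag([1.0, 1.0, d]) @ vt


def shape_distance(src, dst):
    """Geodesic distance between two pre-shapes after rotational alignment.

    This is the non-rigid residual left once the rigid part is removed.
    """
    rotation = optimal_rotation(src, dst)
    inner = np.sum((src @ rotation.T) * dst)
    return np.arccos(np.clip(inner, -1.0, 1.0))
```

Under these assumptions, `optimal_rotation` returns the rigid component between two poses and `shape_distance` measures what remains in shape space. Two poses that differ only by translation, scale, and rotation therefore have a shape distance of approximately zero.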