Action Recognition

1421 directly classified papers

Papers

DVANet: Disentangling View and Action Features for Multi-View Action Recognition AAAI 2024

Dynamic Semantic-Based Spatial Graph Convolution Network for Skeleton-Based Human Action Recognition AAAI 2024

Error Detection in Egocentric Procedural Task Videos CVPR 2024

Part-aware Unified Representation of Language and Skeleton for Zero-shot Action Recognition CVPR 2024

MM-WLAuslan: Multi-View Multi-Modal Word-Level Australian Sign Language Recognition Dataset NIPS 2024

Multimodal Continuous Fingerspelling Recognition via Visual Alignment Learning INTERSPEECH 2024

SyncVSR: Data-Efficient Visual Speech Recognition with End-to-End Crossmodal Audio Token Synchronization INTERSPEECH 2024

From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations CVPR 2024

SignGraph: A Sign Sequence is Worth Graphs of Nodes CVPR 2024

End-to-End Temporal Action Detection with 1B Parameters Across 1000 Frames CVPR 2024

Adapting Short-Term Transformers for Action Detection in Untrimmed Videos CVPR 2024

MaskCLR: Attention-Guided Contrastive Learning for Robust Action Representation Learning CVPR 2024

Action Scene Graphs for Long-Form Understanding of Egocentric Videos CVPR 2024

Recovering Complete Actions for Cross-dataset Skeleton Action Recognition NIPS 2024

CoSTA: End-to-End Comprehensive Space-Time Entanglement for Spatio-Temporal Video Grounding AAAI 2024

Learning from Synthetic Human Group Activities CVPR 2024

HENASY: Learning to Assemble Scene-Entities for Interpretable Egocentric Video-Language Model NIPS 2024

FACT: Frame-Action Cross-Attention Temporal Modeling for Efficient Action Segmentation CVPR 2024

Evaluation of Video Masked Autoencoders' Performance and Uncertainty Estimations for Driver Action and Intention Recognition WACV 2024

Task-Driven Exploration: Decoupling and Inter-Task Feedback for Joint Moment Retrieval and Highlight Detection CVPR 2024

SRTube: Video-Language Pre-Training with Action-Centric Video Tube Features and Semantic Role Labeling CVPR 2024

Co-Speech Gesture Video Generation via Motion-Decoupled Diffusion Model CVPR 2024

Semantic-Aware Video Representation for Few-Shot Action Recognition WACV 2024

CHASE: Learning Convex Hull Adaptive Shift for Skeleton-based Multi-Entity Action Recognition NIPS 2024

Dual DETRs for Multi-Label Temporal Action Detection CVPR 2024