Action Recognition

1421 directly classified papers

Papers

Studying and Mitigating Biases in Sign Language Understanding Models EMNLP 2024

Exploring the Impact of Rendering Method and Motion Quality on Model Performance When Using Multi-View Synthetic Data for Action Recognition WACV 2024

End-to-End Temporal Action Detection with 1B Parameters Across 1000 Frames CVPR 2024

Multi-scale Dynamic and Hierarchical Relationship Modeling for Facial Action Units Recognition CVPR 2024

Efficient Vision-Language pre-training via domain-specific learning for human activities EMNLP 2024

3DInAction: Understanding Human Actions in 3D Point Clouds CVPR 2024

FACT: Frame-Action Cross-Attention Temporal Modeling for Efficient Action Segmentation CVPR 2024

HENASY: Learning to Assemble Scene-Entities for Interpretable Egocentric Video-Language Model NeurIPS 2024

ExACT: Language-guided Conceptual Reasoning and Uncertainty Estimation for Event-based Action Recognition and More CVPR 2024

Multiscale Vision Transformers Meet Bipartite Matching for Efficient Single-stage Action Localization CVPR 2024

CHASE: Learning Convex Hull Adaptive Shift for Skeleton-based Multi-Entity Action Recognition NeurIPS 2024

Scaling Up Dynamic Human-Scene Interaction Modeling CVPR 2024

Dual DETRs for Multi-Label Temporal Action Detection CVPR 2024

TR-DETR: Task-Reciprocal Transformer for Joint Moment Retrieval and Highlight Detection AAAI 2024

Dynamic Semantic-Based Spatial Graph Convolution Network for Skeleton-Based Human Action Recognition AAAI 2024

TACO: Benchmarking Generalizable Bimanual Tool-ACtion-Object Understanding CVPR 2024

Koala: Key Frame-Conditioned Long Video-LLM CVPR 2024

Multimodal Continuous Fingerspelling Recognition via Visual Alignment Learning INTERSPEECH 2024

MVHumanNet: A Large-scale Dataset of Multi-view Daily Dressing Human Captures CVPR 2024

Learning from Synthetic Human Group Activities CVPR 2024

Task-Driven Exploration: Decoupling and Inter-Task Feedback for Joint Moment Retrieval and Highlight Detection CVPR 2024

Uncertainty-aware Action Decoupling Transformer for Action Anticipation CVPR 2024

SyncVSR: Data-Efficient Visual Speech Recognition with End-to-End Crossmodal Audio Token Synchronization INTERSPEECH 2024

LLMs are Good Action Recognizers CVPR 2024

Co-Speech Gesture Video Generation via Motion-Decoupled Diffusion Model CVPR 2024