Research Explorer
Video Understanding
1592 directly classified papers
Papers per year
2006: 1
2012: 1
2013: 30
2014: 15
2015: 38
2016: 22
2017: 39
2018: 49
2019: 91
2020: 115
2021: 207
2022: 160
2023: 254
2024: 216
2025: 297
2026: 57
Papers
A Memory Network Approach for Story-Based Temporal Summarization of 360° Videos
CVPR 2018
Excitation Backprop for RNNs
CVPR 2018
Optical Flow Guided Feature: A Fast and Robust Motion Representation for Video Action Recognition
CVPR 2018
DVQA: Understanding Data Visualizations via Question Answering
CVPR 2018
Object Referring in Videos With Language and Human Gaze
CVPR 2018
Finding "It": Weakly-Supervised Reference-Aware Visual Grounding in Instructional Videos
CVPR 2018
Rethinking Diversified and Discriminative Proposal Generation for Visual Grounding
IJCAI 2018
Multi-modal Circulant Fusion for Video-to-Language and Backward
IJCAI 2018
Trajectory Convolution for Action Recognition
NIPS 2018
Layered Optical Flow Estimation Using a Deep Neural Network with a Soft Mask
IJCAI 2018
Localizing Moments in Video with Temporal Language
EMNLP 2018
Dilated Convolutional Network with Iterative Optimization for Continuous Sign Language Recognition
IJCAI 2018
Evaluating and Complementing Vision-to-Language Technology for People who are Blind with Conversational Crowdsourcing
IJCAI 2018
Action Sets: Weakly Supervised Action Segmentation Without Ordering Constraints
CVPR 2018
Hierarchical Long-term Video Prediction without Supervision
ICML 2018
Temporally Grounding Natural Sentence in Video
EMNLP 2018
Reconstruction Network for Video Captioning
CVPR 2018
Learning Latent Super-Events to Detect Multiple Activities in Videos
CVPR 2018
Cube Padding for Weakly-Supervised Saliency Prediction in 360° Videos
CVPR 2018
Fast Video Object Segmentation by Reference-Guided Mask Propagation
CVPR 2018
Real-World Repetition Estimation by Div, Grad and Curl
CVPR 2018
Spatiotemporal Multiplier Networks for Video Action Recognition
CVPR 2017
Visual, Laughter, Applause and Spoken Expression Features for Predicting Engagement Within TED Talks
INTERSPEECH 2017
Speaker Dependency Analysis, Audiovisual Fusion Cues and a Multimodal BLSTM for Conversational Engagement Recognition
INTERSPEECH 2017
The VQA-Machine: Learning How to Use Existing Vision Algorithms to Answer New Questions
CVPR 2017