Egocentric Vision

436 directly classified papers

Papers

EgoGen: An Egocentric Synthetic Data Generator CVPR 2024

VideoGUI: A Benchmark for GUI Automation from Instructional Videos NeurIPS 2024

GaussianAvatar: Towards Realistic Human Avatar Modeling from a Single Video via Animatable 3D Gaussians CVPR 2024

CosmicMan: A Text-to-Image Foundation Model for Humans CVPR 2024

360+x: A Panoptic Multi-modal Scene Understanding Dataset CVPR 2024

IKEA Ego 3D Dataset: Understanding Furniture Assembly Actions From Ego-View 3D Point Clouds WACV 2024

BEHAVIOR Vision Suite: Customizable Dataset Generation via Simulation CVPR 2024

EgoExoLearn: A Dataset for Bridging Asynchronous Ego- and Exo-centric View of Procedural Activities in Real World CVPR 2024

Visual Redundancy Removal for Composite Images: A Benchmark Dataset and a Multi-Visual-Effects Driven Incremental Method AAAI 2024

Real-Time Simulated Avatar from Head-Mounted Sensors CVPR 2024

EventEgo3D: 3D Human Motion Capture from Egocentric Event Streams CVPR 2024

Summarize the Past to Predict the Future: Natural Language Descriptions of Context Boost Multimodal Object Interaction Anticipation CVPR 2024

Mocap Everyone Everywhere: Lightweight Motion Capture With Smartwatches and a Head-Mounted Camera CVPR 2024

Interaction Region Visual Transformer for Egocentric Action Anticipation WACV 2024

SPIDeRS: Structured Polarization for Invisible Depth and Reflectance Sensing CVPR 2024

LED: A Large-scale Real-world Paired Dataset for Event Camera Denoising CVPR 2024

3D Human Pose Perception from Egocentric Stereo Videos CVPR 2024

PREGO: Online Mistake Detection in PRocedural EGOcentric Videos CVPR 2024

MCD: Diverse Large-Scale Multi-Campus Dataset for Robot Perception CVPR 2024

From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations CVPR 2024

Egocentric Whole-Body Motion Capture with FisheyeViT and Diffusion-Based Motion Refinement CVPR 2024

Read Anywhere Pointed: Layout-aware GUI Screen Reading with Tree-of-Lens Grounding EMNLP 2024

Automated Monitoring of Ear Biting in Pigs by Tracking Individuals and Events WACV 2024

Fusing Personal and Environmental Cues for Identification and Segmentation of First-Person Camera Wearers in Third-Person Views CVPR 2024

X-MIC: Cross-Modal Instance Conditioning for Egocentric Action Generalization CVPR 2024