Self-Supervised Learning

323 directly classified papers

Papers

Language-only Training of Zero-shot Composed Image Retrieval CVPR 2024

UniPAD: A Universal Pre-training Paradigm for Autonomous Driving CVPR 2024

Low-Res Leads the Way: Improving Generalization for Super-Resolution by Self-Supervised Learning CVPR 2024

VoCo: A Simple-yet-Effective Volume Contrastive Learning Framework for 3D Medical Image Analysis CVPR 2024

The reasonable effectiveness of speaker embeddings for violence detection INTERSPEECH 2024

An Entropy-based Text Watermarking Detection Method ACL 2024

Masked Autoencoders for Microscopy are Scalable Learners of Cellular Biology CVPR 2024

ES3: Evolving Self-Supervised Learning of Robust Audio-Visual Speech Representations CVPR 2024

EAGLE: Eigen Aggregation Learning for Object-Centric Unsupervised Semantic Segmentation CVPR 2024

LAFS: Landmark-based Facial Self-supervised Learning for Face Recognition CVPR 2024

VideoMAC: Video Masked Autoencoders Meet ConvNets CVPR 2024

Differentiable Task Graph Learning: Procedural Activity Representation and Online Mistake Detection from Egocentric Videos NIPS 2024

Annotation-Free Audio-Visual Segmentation WACV 2024

Towards Understanding How Transformers Learn In-context Through a Representation Learning Lens NIPS 2024

Learning Degradation-Independent Representations for Camera ISP Pipelines CVPR 2024

OpenGaussian: Towards Point-Level 3D Gaussian-based Open Vocabulary Understanding NIPS 2024

Rethinking the Objectives of Vector-Quantized Tokenizers for Image Synthesis CVPR 2024

DETAIL: Task DEmonsTration Attribution for Interpretable In-context Learning NIPS 2024

T-VSL: Text-Guided Visual Sound Source Localization in Mixtures CVPR 2024

DanceMVP: Self-Supervised Learning for Multi-Task Primitive-Based Dance Performance Assessment via Transformer Text Prompting AAAI 2024

What Do Language Models Hear? Probing for Auditory Representations in Language Models ACL 2024

A General Protocol to Probe Large Vision Models for 3D Physical Understanding NIPS 2024

Depth-aware Test-Time Training for Zero-shot Video Object Segmentation CVPR 2024

Self-Guided Masked Autoencoder NIPS 2024

EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything CVPR 2024