Foundation Models
259 directly classified papers
Papers per year
2021: 5
2022: 13
2023: 23
2024: 104
2025: 109
2026: 5
Papers
VisualEDU: A Benchmark for Assessing Coding and Visual Comprehension through Educational Problem-Solving Video Generation
EMNLP 2025
Advancing Myopia To Holism: Fully Contrastive Language-Image Pre-training
CVPR 2025
Florence-VL: Enhancing Vision-Language Models with Generative Vision Encoder and Depth-Breadth Fusion
CVPR 2025
CLIPErase: Efficient Unlearning of Visual-Textual Associations in CLIP
ACL 2025
MM-OR: A Large Multimodal Operating Room Dataset for Semantic Understanding of High-Intensity Surgical Environments
CVPR 2025
From Head to Tail: Towards Balanced Representation in Large Vision-Language Models through Adaptive Data Calibration
CVPR 2025
VL-RewardBench: A Challenging Benchmark for Vision-Language Generative Reward Models
CVPR 2025
FLARE: A Framework for Stellar Flare Forecasting Using Stellar Physical Properties and Historical Records
IJCAI 2025
AA-CLIP: Enhancing Zero-Shot Anomaly Detection via Anomaly-Aware CLIP
CVPR 2025
AoP-SAM: Automation of Prompts for Efficient Segmentation
AAAI 2025
ViPOcc: Leveraging Visual Priors from Vision Foundation Models for Single-View 3D Occupancy Prediction
AAAI 2025
Foundation Model Driven Appearance Extraction for Robust Multiple Object Tracking
AAAI 2025
Towards a Comprehensive, Efficient and Promptable Anatomic Structure Segmentation Model Using 3D Whole-Body CT Scans
AAAI 2025
Boosting Segment Anything Model Towards Open-Vocabulary Learning
AAAI 2025
SAUGE: Taming SAM for Uncertainty-Aligned Multi-Granularity Edge Detection
AAAI 2025
Locate Anything on Earth: Advancing Open-Vocabulary Object Detection for Remote Sensing Community
AAAI 2025
Semi-supervised 3D Semantic Scene Completion with 2D Vision Foundation Model Guidance
AAAI 2025
CLIP-MSM: A Multi-Semantic Mapping Brain Representation for Human High-Level Visual Cortex
AAAI 2025
Unified Coding for Both Human Perception and Generalized Machine Analytics with CLIP Supervision
AAAI 2025
Building a Multi-modal Spatiotemporal Expert for Zero-shot Action Recognition with CLIP
AAAI 2025
Multifaceted User Modeling in Recommendation: A Federated Foundation Models Approach
AAAI 2025
Federated Foundation Models on Heterogeneous Time Series
AAAI 2025
SegAnyPET: Universal Promptable Segmentation from Positron Emission Tomography Images
ICCV 2025
Enhancing Prompt Generation with Adaptive Refinement for Camouflaged Object Detection
ICCV 2025
Tartan IMU: A Light Foundation Model for Inertial Positioning in Robotics
CVPR 2025