Foundation Models
259 directly classified papers
Papers per year
2021: 5
2022: 13
2023: 23
2024: 104
2025: 109
2026: 5
Papers
OpenVision: A Fully-Open, Cost-Effective Family of Advanced Vision Encoders for Multimodal Learning
ICCV 2025
Unified Multimodal Understanding via Byte-Pair Visual Encoding
ICCV 2025
F-Bench: Rethinking Human Preference Evaluation Metrics for Benchmarking Face Generation, Customization, and Restoration
ICCV 2025
SkySense V2: A Unified Foundation Model for Multi-modal Remote Sensing
ICCV 2025
FE-CLIP: Frequency Enhanced CLIP Model for Zero-Shot Anomaly Detection and Segmentation
ICCV 2025
RoboTron-Mani: All-in-One Multimodal Large Model for Robotic Manipulation
ICCV 2025
DH-FaceVid-1K: A Large-Scale High-Quality Dataset for Face Video Generation
ICCV 2025
Can Generative Geospatial Diffusion Models Excel as Discriminative Geospatial Foundation Models?
ICCV 2025
Correspondence as Video: Test-Time Adaption on SAM2 for Reference Segmentation in the Wild
ICCV 2025
Scaling Omni-modal Pretraining with Multimodal Context: Advancing Universal Representation Learning Across Modalities
ICCV 2025
TAViS: Text-bridged Audio-Visual Segmentation with Foundation Models
ICCV 2025
MaskSAM: Auto-prompt SAM with Mask Classification for Volumetric Medical Image Segmentation
ICCV 2025
Generalizable Object Re-Identification via Visual In-Context Prompting
ICCV 2025
Physics-Guided Foundation Model for Scientific Discovery: An Application to Aquatic Science
AAAI 2025
AoP-SAM: Automation of Prompts for Efficient Segmentation
AAAI 2025
HPSv3: Towards Wide-Spectrum Human Preference Score
ICCV 2025
Semi-supervised 3D Semantic Scene Completion with 2D Vision Foundation Model Guidance
AAAI 2025
Bridging the Gap between Brain and Machine in Interpreting Visual Semantics: Towards Self-adaptive Brain-to-Text Decoding
ICCV 2025
Seeing What Tastes Good: Revisiting Multimodal Distributional Semantics in the Billion Parameter Era
ACL 2025
Test-time Adaptation for Foundation Medical Segmentation Model Without Parametric Updates
ICCV 2025
Not all Views are Created Equal: Analyzing Viewpoint Instabilities in Vision Foundation Models
ICCV 2025
Towards Foundational Models for Single-Chip Radar
ICCV 2025
Controllable-LPMoE: Adapting to Challenging Object Segmentation via Dynamic Local Priors from Mixture-of-Experts
ICCV 2025
TerraMind: Large-Scale Generative Multimodality for Earth Observation
ICCV 2025
VisualEDU: A Benchmark for Assessing Coding and Visual Comprehension through Educational Problem-Solving Video Generation
EMNLP 2025