CAME: Contrastive Automated Model Evaluation

Ru Peng; Qiuyang Duan; Haobo Wang; Jiachen Ma; Yanbo Jiang; Yongjun Tu; Xiu Jiang; Junbo Zhao

ICCV 2023

Abstract

The Automated Model Evaluation (AutoEval) framework explores the possibility of evaluating a trained machine learning model without resorting to a labeled testing set. Despite its promise and some decent results, existing AutoEval methods rely heavily on computing distribution shifts between the unlabeled testing set and the training set. We believe this reliance on the training set becomes another obstacle to shipping this technology to real-world ML development. In this work, we propose Contrastive Automated Model Evaluation (CAME), a novel AutoEval framework that removes the training set from the loop. The core idea of CAME is based on a theoretical analysis that ties model performance to a contrastive loss. Further, with extensive empirical validation, we establish a predictable relationship between the two, inferred solely from the unlabeled/unseen testing set. The resulting framework, CAME, sets new state-of-the-art (SOTA) results for AutoEval, surpassing prior work significantly.
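The idea of predicting accuracy from a contrastive loss measured on unlabeled data can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the exact loss or the form of the relationship, so an InfoNCE-style loss and a linear fit are assumed here, and the names `info_nce_loss`, `fit_linear`, and `predict_accuracy` are hypothetical. The calibration numbers are made-up placeholders, not results from the paper.

```python
import numpy as np


def info_nce_loss(z1, z2, temperature=0.5):
    """Mean InfoNCE-style loss for paired embeddings (row i of z1 pairs with row i of z2).

    Assumption: this particular loss is chosen for illustration only.
    """
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature
    # Subtract the row maximum for numerical stability before the softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))


def fit_linear(losses, accuracies):
    """Least-squares line mapping contrastive loss to accuracy (assumed linear)."""
    slope, intercept = np.polyfit(losses, accuracies, deg=1)
    return float(slope), float(intercept)


def predict_accuracy(loss, slope, intercept):
    """Estimate accuracy for an unlabeled set from its contrastive loss."""
    return float(np.clip(slope * loss + intercept, 0.0, 1.0))


# Placeholder calibration pairs (loss, accuracy); purely illustrative values.
calib_losses = np.array([0.5, 1.0, 1.5, 2.0])
calib_accs = np.array([0.9, 0.8, 0.7, 0.6])
slope, intercept = fit_linear(calib_losses, calib_accs)

# Contrastive loss on a toy "unlabeled test set" of two augmented views.
views_a = np.eye(4)
views_b = np.eye(4)
test_loss = info_nce_loss(views_a, views_b)
estimated_acc = predict_accuracy(test_loss, slope, intercept)
```

In this sketch the training set never appears at prediction time: only the contrastive loss computed on the unlabeled set and the fitted line are needed. How the calibration pairs are obtained is left open here and is an assumption of the example.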

Topics

Artificial Intelligence > Core AI > Interpretability; Machine Learning > Learning Types > Contrastive Learning; Machine Learning > Optimization & Theory > Theory

Keywords

contrastive learning; distribution shift; model performance; automated model evaluation; unlabeled testing set
