Do Pre-Trained Models Benefit Equally in Continual Learning?

Kuan-Ying Lee; Yuanyi Zhong; Yu-Xiong Wang

2023 WACV WACV 2023

Do Pre-Trained Models Benefit Equally in Continual Learning?

Abstract

A large part of the continual learning (CL) literature focuses on developing algorithms for models trained from scratch. While these algorithms work great with from-sc ratch trained models on widely used CL benchmarks, they show dramatic performance drops on more complex datasets (e.g., Split-CUB200). Pre-trained models, widely used to transfer knowledge to downstream tasks, could enhance these methods to be applicable in more realistic scenarios. However, surprisingly, improvements in CL algorithms from pre-training are inconsistent. For instance, while Incremental Classifier and Representation Learning (iCaRL) underperforms Supervised Contrastive Replay (SCR) when trained from scratch, it outperforms SCR when both are initialized with a pre-trained model. This indicates the paradigm current CL literature follows, where all methods are compared in from-scratch training, is not well reflective of the true CL objective and desired progress. Furthermore, we found 1) CL algorithms that exert less regularization benefit more from a pre-trained model; 2) a model pre-trained with a larger dataset (WebImageText in Contrastive Language-Image Pre-training (CLIP) vs. ImageNet) does not guarantee a better improvement. Based on these findings, we introduced a simple yet effective baseline that employs minimum regularization and leverages the more beneficial pre-trained model, which outperforms state-of-the-art methods when pre-training is applied. Our code is available at https://github.com/eric11220/pretrained-models-in-CL.

❓ The Questioner

🌉 Interdisciplinary Bridge — Deep Learning and Machine Learning

🧭 Keyword Pioneer — supervised contrastive replay

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio