Where to start? Analyzing the potential value of intermediate models

Leshem Choshen; Elad Venezian; Shachar Don-Yehiya; Noam Slonim; Yoav Katz

EMNLP 2023

Abstract

Previous studies observed that finetuned models may be better base models than the vanilla pretrained model. Such a model, finetuned on some source dataset, may provide a better starting point for a new finetuning process on a desired target dataset. Here, we perform a systematic analysis of this intertraining scheme over a wide range of English classification tasks. Surprisingly, our analysis suggests that the potential intertraining gain can be analyzed independently for the target dataset under consideration and for a base model being considered as a starting point. Hence, a performant model is generally strong, even if its training data was not aligned with the target dataset. Furthermore, we leverage our analysis to propose a practical and efficient approach to determine if and how to select a base model in real-world settings. Last, we release a continuously updated ranking of the best models in the HuggingFace hub per architecture.
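One way to read the independence claim is as an additive decomposition: the gain for a (base model, target) pair is approximated by a per-model effect plus a per-target effect. The sketch below illustrates that reading only. The additive form, the function names, and all numbers are assumptions made for illustration; they are not taken from the paper.

```python
from statistics import mean

# Hypothetical toy numbers (NOT results from the paper): accuracy gain over
# the vanilla pretrained model, in points, for each (base model, target) pair.
gains = {
    "base_a": {"target_x": 2.0, "target_y": 0.5, "target_z": 1.1},
    "base_b": {"target_x": 0.8, "target_y": -0.4, "target_z": 0.2},
    "base_c": {"target_x": -0.5, "target_y": -1.6, "target_z": -0.9},
}


def rank_base_models(gains):
    """Order base models by their mean gain across all targets."""
    scores = {m: mean(per_target.values()) for m, per_target in gains.items()}
    return sorted(scores, key=scores.get, reverse=True)


def target_sensitivity(gains):
    """Mean gain each target receives across all base models."""
    targets = next(iter(gains.values())).keys()
    return {t: mean(gains[m][t] for m in gains) for t in targets}


def additive_estimate(gains, model, target):
    """Assumed additive model: model effect + target effect - grand mean."""
    model_effect = mean(gains[model].values())
    target_effect = mean(gains[m][target] for m in gains)
    grand_mean = mean(v for per in gains.values() for v in per.values())
    return model_effect + target_effect - grand_mean


ranking = rank_base_models(gains)
sensitivity = target_sensitivity(gains)
estimate = additive_estimate(gains, "base_a", "target_x")
```

Under this reading, a base model that ranks highest on average is a reasonable default starting point for a new target, and the per-target average indicates how much any intermediate model is likely to help on that target.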

Topics

Machine Learning > Learning Types > Transfer Learning
Natural Language Processing > Resources & Methods > Transfer Learning
Machine Learning > Learning Types > Fine-Tuning
Machine Learning > Learning Paradigms > Multi-Task Learning
Deep Learning > Learning Types > Transfer Learning
Deep Learning > Learning Types > Fine-Tuning

Keywords

model selection, transfer learning, text classification, pretrained model, classification task, model reuse, intermediate model
