Beyond Fine-tuning: Unleashing the Potential of Continuous Pretraining for Clinical LLMs.

Clément Christophe; Tathagata Raha; Svetlana Maslenkova; Muhammad Umar Salman; Praveenkumar Kanithi; Marco AF Pimentel; Shadab Khan

EMNLP 2024

Abstract

Large Language Models (LLMs) have demonstrated significant potential in revolutionizing clinical applications. In this study, we investigate the efficacy of four techniques in adapting LLMs for clinical use-cases: continuous pretraining, instruct fine-tuning, NEFTune, and prompt engineering. We employ these methods on Mistral 7B and Mixtral 8x7B models, leveraging a large-scale clinical pretraining dataset of 50 billion tokens and an instruct fine-tuning dataset of 500 million tokens. Our evaluation across various clinical tasks reveals nuanced insights. While continuous pretraining beyond 250 billion tokens yields marginal improvements, instruct fine-tuning emerges as a more influential factor. Notably, NEFTune, designed primarily to enhance generation quality, surprisingly demonstrates additional gains on our benchmark. These findings underscore the importance of tailoring fine-tuning strategies and exploring innovative techniques to optimize LLM performance in the clinical domain.
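
Of the four techniques, NEFTune is the least self-explanatory from the abstract alone. As generally described in the NEFTune literature, it adds uniform random noise to the token embeddings during fine-tuning, scaled by alpha / sqrt(L * d) for sequence length L and embedding dimension d. The sketch below illustrates that scaling rule only; it is not the authors' training code, and the function name `neftune_noise`, the list-of-lists embedding layout, and the value alpha = 5.0 are assumptions chosen for illustration.

```python
import math
import random


def neftune_noise(embeddings, alpha=5.0, rng=None):
    """Return a noisy copy of `embeddings` (a seq_len x dim list of lists).

    Hypothetical helper: each entry receives noise drawn from
    Uniform(-1, 1) and scaled by alpha / sqrt(seq_len * dim).
    """
    rng = rng or random.Random()
    seq_len = len(embeddings)
    dim = len(embeddings[0])
    scale = alpha / math.sqrt(seq_len * dim)
    # Build a new nested list so the input embeddings are left untouched.
    return [[x + scale * rng.uniform(-1.0, 1.0) for x in row] for row in embeddings]


# Toy input: 4 tokens with 8-dimensional zero embeddings (illustrative only).
emb = [[0.0] * 8 for _ in range(4)]
noisy = neftune_noise(emb, alpha=5.0, rng=random.Random(0))
max_dev = max(abs(v) for row in noisy for v in row)
```

Because the scale shrinks with both sequence length and embedding width, the perturbation stays bounded by alpha / sqrt(L * d) per entry. In practice the noise would be applied only during training, never at inference.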

Topics

Artificial Intelligence > Learning Paradigms > Transfer Learning
Machine Learning > Optimization & Theory > Neural Network Optimization
Natural Language Processing > Resources & Methods > Large Language Models
Healthcare & Medicine > Clinical > Medical AI
Deep Learning > Learning Types > Fine-Tuning

Keywords

domain adaptation, prompt engineering, instruction fine-tuning, medical domain, continuous pretraining, large language model, instruct fine-tuning, clinical llm

Related papers

EmbodiedBERT: Cognitively Informed Metaphor Detection Incorporating Sensorimotor Information 2024

Mitigating Matthew Effect: Multi-Hypergraph Boosted Multi-Interest Self-Supervised Learning for Conversational Recommendation 2024

Learning to Extract Structured Entities Using Language Models 2024

Towards Understanding Jailbreak Attacks in LLMs: A Representation Space Analysis 2024

CSSL: Contrastive Self-Supervised Learning for Dependency Parsing on Relatively Free Word Ordered and Morphologically Rich Low Resource Languages 2024