Fine-Tuning and Prompt Optimization: Two Great Steps that Work Better Together

Dilara Soylu; Christopher Potts; Omar Khattab

2024 EMNLP EMNLP 2024

Fine-Tuning and Prompt Optimization: Two Great Steps that Work Better Together

Abstract

AbstractNatural Language Processing (NLP) systems are increasingly taking the form of sophisticated modular pipelines, e.g., Retrieval Augmented Generation (RAG), where each module may involve a distinct Language Model (LM) and an associated prompt template. These compound systems often lack intermediate labels or gradient flow to optimize each module, making their end-to-end optimization challenging. Here we seek strategies to optimize both the module-level LM weights and the associated prompt templates of such systems to maximize a downstream task metric. We propose for the first time combining the weight and prompt optimization strategies to optimize a modular LM pipeline by alternating between the two to get the same LM to teach itself. In experiments with multi-hop QA, mathematical reasoning, and feature-based classification using mistral-7b, llama-2-7b, and llama-3-8b, these BetterTogether strategies optimizing the weights and prompts of a pipeline together outperform directly optimizing weights alone and prompts alone by up to 60% and 6%, respectively, on average across LMs and tasks. Our BetterTogether optimizer is released in DSPy at [http://dspy.ai](http://dspy.ai).

🌉 Interdisciplinary Bridge — Artificial Intelligence and Deep Learning and Machine Learning and Natural Language Processing

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Dilara Soylu , Christopher Potts , Omar Khattab

Topics

Artificial Intelligence > Learning Paradigms > Transfer Learning Machine Learning > Optimization & Theory > Optimization Natural Language Processing > Resources & Methods > Large Language Models Artificial Intelligence > Core AI > Large Language Models Machine Learning > Learning Types > Fine-Tuning Deep Learning > Optimization & Theory > Optimization Natural Language Processing > Generation > Retrieval-Augmented Generation Machine Learning > Learning Types > Prompt Engineering Deep Learning > Techniques > Fine-Tuning Deep Learning > Techniques > Prompt Engineering

Keywords

retrieval augmented generation prompt engineering language model prompt optimization retrieval-augmented generation weight optimization modular pipeline

Download PDF

Related papers

EmbodiedBERT: Cognitively Informed Metaphor Detection Incorporating Sensorimotor Information 2024

Mitigating Matthew Effect: Multi-Hypergraph Boosted Multi-Interest Self-Supervised Learning for Conversational Recommendation 2024

Learning to Extract Structured Entities Using Language Models 2024

Towards Understanding Jailbreak Attacks in LLMs: A Representation Space Analysis 2024

CSSL: Contrastive Self-Supervised Learning for Dependency Parsing on Relatively Free Word Ordered and Morphologically Rich Low Resource Languages 2024