Demystifying Prompts in Language Models via Perplexity Estimation

Hila Gonen; Srini Iyer; Terra Blevins; Noah Smith; Luke Zettlemoyer

EMNLP 2023

Abstract

Language models can be prompted to perform a wide variety of tasks with zero- and few-shot in-context learning. However, performance varies significantly with the choice of prompt, and we do not yet understand why this happens. In this paper, we analyze the factors that contribute to this variance and establish a new empirical hypothesis: the performance of a prompt is predicted by the extent to which the model is familiar with the language it contains. Over a wide range of tasks, we show that the lower the perplexity of the prompt, the better it is able to perform the task, when considering reasonable prompts that are related to it. As part of our analysis, we also devise a method to automatically extend a small seed set of manually written prompts by paraphrasing with GPT3 and backtranslation. This larger set allows us to verify that perplexity is a strong predictor of the success of a prompt and we show that the lowest perplexity prompts are consistently effective.
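
For readers unfamiliar with the quantity the abstract relies on, the following is a minimal sketch of the standard definition of perplexity: the exponential of the mean negative log-probability per token. It is not the authors' code. The abstract does not say exactly which text the perplexity is computed over or which model supplies the probabilities, so the function names `perplexity` and `rank_by_perplexity` and the example log-probabilities are hypothetical, and the per-token log-probabilities are assumed to come from whichever language model is being prompted.

```python
import math
from typing import Dict, List, Sequence


def perplexity(token_logprobs: Sequence[float]) -> float:
    """Return exp of the mean negative natural-log probability per token."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def rank_by_perplexity(prompt_logprobs: Dict[str, Sequence[float]]) -> List[str]:
    """Order prompts from lowest to highest perplexity."""
    return sorted(prompt_logprobs, key=lambda p: perplexity(prompt_logprobs[p]))


# Hypothetical per-token log-probabilities for two candidate prompts.
# In practice these would be read from the language model's output.
example = {
    "prompt A": [math.log(0.5)] * 4,  # every token assigned probability 0.5
    "prompt B": [math.log(0.1)] * 4,  # every token assigned probability 0.1
}
ranked = rank_by_perplexity(example)  # lowest-perplexity prompt comes first
```

Under the hypothesis stated in the abstract, the prompt at the front of `ranked` would be the preferred candidate, provided all candidates are reasonable prompts related to the task.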

Topics

Natural Language Processing > Generation > Language Modeling
Natural Language Processing > Resources & Methods > Large Language Models
Machine Learning > Learning Types > Few-Shot Learning
Artificial Intelligence > Core AI > Large Language Models
Machine Learning > Learning Types > In-Context Learning
Deep Learning > Optimization & Theory > Neural Network Optimization
Deep Learning > Learning Types > Zero-Shot Learning
Deep Learning > Learning Types > In-Context Learning

Keywords

zero-shot learning, few-shot learning, in-context learning, prompt engineering, language model, perplexity estimation
