Explaining Data Patterns in Natural Language with Language Models

Chandan Singh; John X. Morris; Jyoti Aneja; Alexander Rush; Jianfeng Gao

EMNLP 2023

Abstract

Large language models (LLMs) have displayed an impressive ability to harness natural language to perform complex tasks. We explore whether we can leverage this ability to find and explain patterns in data. Specifically, given a pre-trained LLM and data examples, we apply interpretable autoprompting (iPrompt) to generate a natural language string explaining the data. iPrompt iteratively generates explanations with an LLM and reranks them based on their performance when used as a prompt. Experiments on a wide range of datasets, from synthetic mathematics to natural language understanding, show that iPrompt can yield meaningful insights by accurately finding dataset explanations that are human-interpretable. Moreover, iPrompt is reasonably efficient, as it does not require access to model gradients and works with relatively small models (e.g., ~6 billion parameters rather than ≥100 billion). Finally, experiments with scientific datasets show the potential for iPrompt to aid in scientific discovery.
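
The generate-then-rerank loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function `iprompt_sketch`, its parameters, and the toy stand-ins `TOY_MODEL`, `toy_propose`, and `toy_score` are all hypothetical names introduced here. In a real setting, the proposal step would presumably sample candidate explanations from an LLM, and the scoring step would measure how well the LLM predicts each output when the candidate is used as a prompt.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Example = Tuple[str, str]  # (input text, expected output text)


def iprompt_sketch(
    examples: Sequence[Example],
    propose: Callable[[List[str]], List[str]],
    score: Callable[[str, Sequence[Example]], float],
    n_iters: int = 3,
    keep_top: int = 2,
) -> List[Tuple[float, str]]:
    """Alternate between proposing candidate explanations and reranking them."""
    pool: Dict[str, float] = {}
    seeds: List[str] = []
    for _ in range(n_iters):
        # Proposal step: ask for new candidates, conditioned on the current best.
        for cand in propose(seeds):
            if cand not in pool:
                pool[cand] = score(cand, examples)
        # Reranking step: keep only the candidates that work best as prompts.
        ranked = sorted(pool.items(), key=lambda kv: kv[1], reverse=True)
        pool = dict(ranked[:keep_top])
        seeds = list(pool)
    return [(s, c) for c, s in pool.items()]


# Toy stand-in for an LLM (assumption): each prompt maps to a fixed behavior.
TOY_MODEL = {
    "Add the two numbers.": lambda a, b: a + b,
    "Multiply the two numbers.": lambda a, b: a * b,
    "Return the first number.": lambda a, b: a,
}


def toy_propose(seeds: List[str]) -> List[str]:
    # A real proposer would sample text from an LLM; this one ignores the seeds.
    return list(TOY_MODEL)


def toy_score(prompt: str, examples: Sequence[Example]) -> float:
    # Accuracy of the toy model on the examples when driven by this prompt.
    fn = TOY_MODEL[prompt]
    hits = sum(str(fn(*map(int, x.split()))) == y for x, y in examples)
    return hits / len(examples)


examples = [("2 3", "5"), ("4 1", "5"), ("0 7", "7")]
best = iprompt_sketch(examples, toy_propose, toy_score)
print(best[0])
```

On this toy data, only the addition prompt reproduces every output, so it ends up ranked first. The sketch needs no gradients, only the ability to score a candidate prompt, which is the property the abstract highlights.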

Topics

Artificial Intelligence > Core AI > Interpretability
Natural Language Processing > Understanding > Semantic Analysis
Natural Language Processing > Generation > Language Modeling
Artificial Intelligence > Core AI > Large Language Models

Keywords

prompt engineering, prompt generation, scientific discovery, natural language explanation, pattern discovery, large language model, interpretable autoprompting, human-interpretable explanation, data pattern
