2025 IJCAI IJCAI 2025

In-context Learning Demonstration Generation with Text Distillation

Abstract

In-context learning (ICL), a paradigm derived from large language models (LLMs), holds significant promise but is notably sensitive to the choice of input demonstrations. While numerous methodologies have been developed to select the optimal demonstrations from existing datasets, our work alternatively proposes to generate representative demonstrations through a Distillation-based Demonstration Generation (DDG) framework. Specifically, our approach aims to generate demonstrations that encapsulate the essential attributes of the target dataset. Rather than optimizing these demonstrations directly, we design a generative model and try to refine it by minimizing the discrepancies between the calculative models trained on generated demonstrations and the original datasets respectively. Additionally, we leverage a teacher-student framework to stabilize the training process and improve the quality of the synthesized samples. Extensive experiments conducted across ten prevalent text datasets demonstrate that our DDG method substantially outperforms existing state-of-the-art methodologies. Our code will be available at https://github.com/wwyq1/DDG.

🌉 Interdisciplinary Bridge — Machine Learning and Natural Language Processing
🧭 Keyword Pioneer — demonstration generation
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Speech & Audio