IJCNLP 2025

STAR: Self-Automated Back-Querying for Production Data Generation

Abstract

The pervasiveness of large language models (LLMs) in enterprise settings has brought with it significant risks associated with their use. Guardrail technologies aim to mitigate these risks by filtering LLMs’ input and output text through various detectors. However, developing and maintaining robust detectors poses many challenges, one of which is the difficulty of acquiring production-quality labeled data on real LLM outputs before deployment. In this work, we propose STAR, a simple and intuitive solution for generating production-like labeled data for LLM guardrails development. STAR is based on two key ideas: (i) using self-automated back-querying to synthetically generate data, paired with (ii) a sparse human-in-the-loop clustering technique to label the data. The aim of self-automated back-querying is to construct a parallel corpus that is roughly representative of the original dataset and resembles real LLM output. We then infuse existing datasets with our synthetically generated examples to produce robust training data for our detectors. We test our technique on one of the most difficult and nuanced detectors, the identification of health advice in LLM output, and demonstrate improvement over other solutions. Our detector outperforms GPT-4o by up to 3.48%, despite having 400x fewer parameters.
