ACL 2025

SQLGenie: A Practical LLM based System for Reliable and Efficient SQL Generation

Abstract

Large Language Models (LLMs) enable natural language to SQL conversion, allowing users to query databases without SQL expertise. However, generating accurate, efficient queries is challenging due to ambiguous intent, domain knowledge requirements, and database constraints. Extensive reasoning improves SQL quality but increases computational costs and latency. We propose SQLGenie, a practical system for reliable SQL generation. It consists of three components: (1) Table Onboarder, which analyzes new tables, optimizes indexing, partitions data, identifies foreign key relationships, and stores schema details for SQL generation; (2) SQL Generator, an LLM-based system producing accurate SQL; and (3) Feedback Augmentation, which filters correct query-SQL pairs, leverages multiple LLM agents for complex SQL, and stores verified examples. SQLGenie achieves state-of-the-art performance on public benchmarks (92.8% execution accuracy on WikiSQL, 82.1% on Spider, 73.8% on BIRD) and internal datasets, surpassing the best single-LLM baseline by 21.5% and the strongest pipeline competitor by 5.3%. Its hybrid variant optimally balances accuracy and efficiency, reducing generation time by 64% compared to traditional multi-LLM approaches while maintaining competitive accuracy.
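
To make the three-component flow concrete, the following is a minimal sketch and not the authors' implementation. It assumes a SQLite database, a caller-supplied `llm` function that maps a prompt string to SQL text, and "the query executes without error" as a crude stand-in for the correctness filtering the abstract describes. The names `SchemaStore`, `onboard_tables`, `generate_sql`, and `record_if_valid` are hypothetical, and the indexing, partitioning, and multi-agent steps are omitted.

```python
from __future__ import annotations

import sqlite3
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class SchemaStore:
    """Hypothetical container for what a table onboarding step might record."""
    tables: dict[str, list[str]] = field(default_factory=dict)
    foreign_keys: list[tuple[str, str]] = field(default_factory=list)


def onboard_tables(conn: sqlite3.Connection) -> SchemaStore:
    """Collect column names and foreign keys for every table in a SQLite database."""
    store = SchemaStore()
    names = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (name,) in names:
        columns = conn.execute(f"PRAGMA table_info('{name}')").fetchall()
        store.tables[name] = [col[1] for col in columns]  # col[1] is the column name
        for fk in conn.execute(f"PRAGMA foreign_key_list('{name}')").fetchall():
            # fk[2] = referenced table, fk[3] = local column, fk[4] = referenced column
            store.foreign_keys.append((f"{name}.{fk[3]}", f"{fk[2]}.{fk[4]}"))
    return store


def build_prompt(question: str, store: SchemaStore,
                 examples: list[tuple[str, str]]) -> str:
    """Assemble schema details and stored examples into one prompt string."""
    lines = ["Schema:"]
    for table, cols in store.tables.items():
        lines.append(f"  {table}({', '.join(cols)})")
    for src, dst in store.foreign_keys:
        lines.append(f"  {src} -> {dst}")
    for past_question, past_sql in examples:
        lines.append(f"Q: {past_question}\nSQL: {past_sql}")
    lines.append(f"Q: {question}\nSQL:")
    return "\n".join(lines)


def generate_sql(question: str, store: SchemaStore,
                 examples: list[tuple[str, str]],
                 llm: Callable[[str], str]) -> str:
    """Ask the (assumed) LLM callable for SQL given the schema and examples."""
    return llm(build_prompt(question, store, examples)).strip()


def record_if_valid(conn: sqlite3.Connection, question: str, sql: str,
                    examples: list[tuple[str, str]]) -> bool:
    """Keep a question-SQL pair only if the SQL runs; a proxy for real verification."""
    try:
        conn.execute(sql).fetchall()
    except sqlite3.Error:
        return False
    examples.append((question, sql))
    return True
```

In this sketch, pairs accepted by `record_if_valid` are fed back into later prompts as few-shot examples, which is one plausible reading of "stores verified examples". A real system would need a stronger correctness check than successful execution, and should run candidate queries on a read-only connection.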
