2026 EACL EACL 2026

LitE-SQL: A Lightweight and Efficient Text-to-SQL Framework with Vector-based Schema Linking and Execution-Guided Self-Correction

Abstract

AbstractThe Text-to-SQL task translates natural language questions into SQL queries, enabling intuitive database interaction for non-experts. While recent methods leveraging Large Language Models (LLMs) achieve strong performance, their reliance on proprietary models raises concerns about deployment feasibility and data privacy. In this work, we introduce LitE-SQL, a Lightweight and Efficient framework with two components: (i) a Schema Retriever that performs efficient schema linking using a vector database of pre-computed schema embeddings, optimized with a hard-negative supervised contrastive objective to distinguish semantically similar but functionally irrelevant columns, and (ii) a SQL Generator fine-tuned in two stages—supervised fine-tuning followed by execution-guided reinforcement—enabling execution-guided self-correction without multi-candidate sampling, which is commonly required by prior LLM-based approaches.On BIRD, LitE-SQL achieves 72.10% execution accuracy, and on Spider 1.0 it reaches 88.45%, demonstrating comparable or superior performance to LLM-based methods despite using 2× to 30× fewer parameters. Our findings demonstrate that high-quality Text-to-SQL generation is feasible with lightweight models, offering a practical solution for privacy-sensitive and resource-constrained settings.

🌉 Interdisciplinary Bridge — Machine Learning and Natural Language Processing
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio