

Poster
in
Workshop: The 3rd DL4C Workshop: Emergent Possibilities and Challenges in Deep Learning for Code

Optimizing Small Language Models for NL2SQL

Wenqi Pei · Xu Hailing · Henry Zhao · Chen Han · Zining Zhang · Shizheng Hou · Luo Pingyi · Bingsheng He


Abstract:

Natural Language to SQL conversion (NL2SQL) has advanced significantly with large language models (LLMs), yet these models often rely on closed-source systems and high computational resources, raising concerns about data privacy and deployment. In contrast, small language models (SLMs) struggle with NL2SQL tasks, exhibiting poor performance and incompatibility with existing frameworks. To address these issues, we introduce Feather-SQL, a new lightweight framework specifically designed for SLMs. Feather-SQL leverages schema pruning and linking to enhance column-question alignment, and multi-path, multi-candidate generation to boost SQL accuracy and executability. Additionally, we introduce the 1+1 Model Collaboration Paradigm, which pairs a general-purpose chat model for auxiliary reasoning tasks with a fine-tuned SQL specialist for SQL generation, combining broad reasoning with domain-specific precision. Experimental results on BIRD demonstrate that Feather-SQL improves NL2SQL performance on SLMs, with a boost of around 10% for non-fine-tuned models. The proposed paradigm raises the accuracy ceiling of SLMs to 54.76%, highlighting its effectiveness.
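The pipeline described in the abstract — auxiliary reasoning by a chat model, multi-path candidate generation by a SQL specialist, and selection by executability — can be sketched as below. This is a minimal illustrative sketch, not the paper's actual implementation: the function names, model interfaces, and the first-executable selection rule are all assumptions made here for clarity.

```python
# Hypothetical sketch of the "1+1 Model Collaboration Paradigm":
# a general-purpose chat model handles auxiliary reasoning (schema
# pruning/linking), while a fine-tuned SQL specialist generates one
# candidate query per prompt path; candidates are then filtered by
# executability against the target database. All interfaces here are
# illustrative stand-ins, not Feather-SQL's real API.
import sqlite3

def prune_schema(chat_model, question, schema):
    # Chat model decides which columns are relevant to the question.
    return [col for col in schema if chat_model(question, col)]

def generate_candidates(sql_model, question, pruned_schema, paths):
    # Specialist model produces one SQL candidate per prompt path.
    return [sql_model(question, pruned_schema, path) for path in paths]

def pick_executable(candidates, db_path):
    # Return the first candidate that executes without error.
    conn = sqlite3.connect(db_path)
    try:
        for sql in candidates:
            try:
                conn.execute(sql)
                return sql
            except sqlite3.Error:
                continue
        return None
    finally:
        conn.close()
```

In this sketch, multi-candidate generation pays off because a single SLM decoding pass often yields malformed SQL; checking executability across several paths lets the framework fall back to a valid query instead of failing outright.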
