Large language models are increasingly used to translate natural language into SQL, but how reliable are they in real-world settings? In this session, we present a focused evaluation framework for measuring NL-to-SQL performance, including execution correctness, robustness, and query efficiency under varying levels of database context. We’ll discuss how structured QA and validation approaches can help move LLM systems from benchmark success to production reliability. We invite researchers and practitioners working on NL-to-SQL systems, LLM evaluation, and database applications to join the discussion and share perspectives from real-world deployments.