[Short] A Formal Language Benchmark for LLMs
Bishwamittra Ghosh ⋅ Krishna Gummadi ⋅ Evimaria Terzi
Abstract
Empirical research has guided the progress of large language models (LLMs) over the years, yet we often have a limited understanding of the underlying data fed to them. We take an orthogonal approach and propose a formal language benchmark for studying LLMs. We ask two questions: (a) why do we need formal languages as a test bed for studying LLMs, and (b) how do we measure the language proficiency of an LLM? As contributions, we highlight the precision and control offered by probabilistic formal languages, which make them well suited for studying LLMs. Moreover, we contrast a generative test with a discriminative test for determining the language proficiency of an LLM, and note that the discriminative test is comparable across LLMs.