Poster in Workshop: 3rd Workshop on Navigating and Addressing Data Problems For Foundation Models (DATA-FM)

[Short] A Formal Language Benchmark for LLMs

Bishwamittra Ghosh ⋅ Krishna Gummadi ⋅ Evimaria Terzi


Abstract

Empirical research has guided the progress of large language models (LLMs) over the years, yet we often have only a limited understanding of the underlying data fed to them. We take an orthogonal approach to the problem and propose a formal language benchmark for studying LLMs. We ask two questions: (a) why do we need formal languages as a test bed for studying LLMs, and (b) how do we measure the language proficiency of an LLM? As contributions, we highlight the precision and control offered by probabilistic formal languages, which make them well suited for studying LLMs. Moreover, we contrast a generative test with a discriminative test for determining the language proficiency of an LLM, of which the latter is comparable across LLMs.
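As a rough illustration of the ideas named in the abstract, the sketch below defines a toy probabilistic formal language and two scoring functions. Everything in it is an assumption made for illustration: the grammar (strings of the form a^n b^n), the constant `P_RECURSE`, and the functions `sample_string`, `string_probability`, `generative_test`, and `discriminative_test` are hypothetical and are not taken from the paper, whose actual definitions of the two tests may differ. The point is only that a formal language gives exact membership and exact string probabilities, which is the kind of precision and control the abstract refers to.

```python
import random

# Hypothetical toy probabilistic grammar for the language a^n b^n (n >= 1):
#   S -> a S b   with probability P_RECURSE
#   S -> a b     with probability 1 - P_RECURSE
P_RECURSE = 0.4


def sample_string(rng):
    """Sample one string from the toy grammar using the given random.Random."""
    n = 1
    while rng.random() < P_RECURSE:
        n += 1
    return "a" * n + "b" * n


def string_probability(s):
    """Exact probability of s under the toy grammar (0.0 if s is not in the language)."""
    n = len(s) // 2
    if n < 1 or s != "a" * n + "b" * n:
        return 0.0
    return (P_RECURSE ** (n - 1)) * (1 - P_RECURSE)


def generative_test(model_samples):
    """Assumed generative test: fraction of model outputs that are in the language."""
    in_language = sum(string_probability(s) > 0 for s in model_samples)
    return in_language / len(model_samples)


def discriminative_test(score, pairs):
    """Assumed discriminative test: fraction of (valid, invalid) pairs
    for which the scoring function prefers the valid string."""
    correct = sum(score(valid) > score(invalid) for valid, invalid in pairs)
    return correct / len(pairs)
```

In this sketch, `score` stands in for whatever per-string score a model exposes (for example, a log-likelihood). Because the discriminative score is computed over a fixed set of pairs rather than over each model's own samples, it is one plausible reading of why such a test would be comparable across LLMs.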
