
Poster
in
Workshop: Navigating and Addressing Data Problems for Foundation Models (DPFM)

Evaluating Large Language Models in an Emerging Domain: A Pilot Study in Decentralized Finance

Joshua Pearlson · Xiaoyuan Liu · Chengsong Huang · Kripa George · Dawn Song · Chenguang Wang

Keywords: [ NER ] [ NLP ] [ DeFi ]


Abstract:

Large Language Models (LLMs) have demonstrated exceptional zero-shot generalization. However, it is challenging to determine whether this ability is genuinely emergent or is due to extensive exposure to large volumes of relevant data during pretraining. In this paper, we introduce a novel method that uses data from an emerging domain, decentralized finance (DeFi), to test emergent abilities. Specifically, we collect a DeFi white paper corpus comprising 150 documents and build the first named entity recognition (NER) dataset in the DeFi domain. Using the collected timestamps, we show that newly introduced entities, those appearing after a model's knowledge cutoff, are recognized less effectively than entities that existed before that date. We also demonstrate that these performance limitations can be mitigated through techniques such as in-context learning or fine-tuning; our comprehensive experiments with these techniques show consistent performance improvements across different models. Despite the promising results, we conclude that improving the understanding of large language models in emerging domains remains an open research question and requires fundamental algorithmic innovations from the community.
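The cutoff-based evaluation described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' released code: the record format, field names, and toy entities below are assumptions, and real scores would come from the paper's DeFi NER dataset and actual model outputs.

```python
from datetime import date

def recall_by_cutoff(records, cutoff):
    """Split gold entities at a model's knowledge cutoff and compute recall
    on each side. Each record is a dict (hypothetical schema) with keys:
    'entity' (gold span), 'timestamp' (datetime.date of the white paper),
    'recognized' (bool: whether the model's output contained the span)."""
    buckets = {"pre_cutoff": [0, 0], "post_cutoff": [0, 0]}  # [hits, total]
    for r in records:
        key = "pre_cutoff" if r["timestamp"] < cutoff else "post_cutoff"
        buckets[key][1] += 1
        if r["recognized"]:
            buckets[key][0] += 1
    # Recall per bucket; None if a bucket is empty
    return {k: (hits / total if total else None)
            for k, (hits, total) in buckets.items()}

# Toy example with made-up entities and timestamps
records = [
    {"entity": "Uniswap", "timestamp": date(2020, 5, 1), "recognized": True},
    {"entity": "Aave", "timestamp": date(2020, 1, 1), "recognized": True},
    {"entity": "NewProtocolX", "timestamp": date(2023, 6, 1), "recognized": False},
]
print(recall_by_cutoff(records, cutoff=date(2021, 9, 1)))
# → {'pre_cutoff': 1.0, 'post_cutoff': 0.0}
```

A gap between the pre-cutoff and post-cutoff recall, as in this toy output, is the signal the paper uses to separate genuine generalization from memorized exposure.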
