Poster
Better autoregressive regression with LLMs via regression-aware fine-tuning
Michal Lukasik · Zhao Meng · Harikrishna Narasimhan · Yin-Wen Chang · Aditya Krishna Menon · Felix Yu · Sanjiv Kumar
Hall 3 + Hall 2B #317
Decoder-based large language models (LLMs) have proven highly versatile, with remarkable successes even on problems ostensibly removed from traditional language generation. One such example is solving regression problems, where the targets are real numbers rather than textual tokens. A common approach to using LLMs for such problems is to fine-tune with the cross-entropy loss and use autoregressive sampling at inference time. Another approach relies on fine-tuning a separate predictive head with a suitable loss such as squared error. While each approach has had success, there has been limited study of principled ways of using decoder LLMs for regression. In this work, we compare different prior works under a unified view, and introduce regression-aware fine-tuning (RAFT), a novel approach based on the Bayes-optimal decision rule. We demonstrate how RAFT improves over established baselines on several benchmarks and model families.
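To make the Bayes-optimal connection concrete: under squared error, the Bayes-optimal prediction is the conditional mean of the target under the model's predictive distribution, rather than a single decoded string. The sketch below is not the paper's implementation; it only illustrates this idea under the assumption that the model's scores over a finite set of candidate numeric outputs are available, with the candidate set, scoring, and loss wiring chosen purely for illustration.

```python
# Illustrative sketch (assumptions, not the paper's code): given model scores
# ("logits") over a finite set of candidate numeric outputs, the Bayes-optimal
# prediction under squared error is the expectation E[y | x], and a
# regression-aware objective can penalize the squared gap between this
# expectation and the true target.
import torch

def expected_prediction(logits: torch.Tensor, candidate_values: torch.Tensor) -> torch.Tensor:
    """E[y | x] over candidate numeric outputs, weighted by model probabilities."""
    probs = torch.softmax(logits, dim=-1)            # p(y_k | x) for each candidate
    return (probs * candidate_values).sum(dim=-1)    # sum_k p(y_k | x) * y_k

def regression_aware_loss(logits: torch.Tensor,
                          candidate_values: torch.Tensor,
                          targets: torch.Tensor) -> torch.Tensor:
    """Squared error between the expected prediction and the true target."""
    preds = expected_prediction(logits, candidate_values)
    return torch.mean((preds - targets) ** 2)

# Toy usage: 2 examples, 5 hypothetical candidate values each.
candidates = torch.tensor([0.0, 1.0, 2.0, 3.0, 4.0])
logits = torch.randn(2, 5, requires_grad=True)       # stand-in for LLM scores of each candidate
targets = torch.tensor([1.5, 3.0])
loss = regression_aware_loss(logits, candidates, targets)
loss.backward()                                       # gradients flow back to the scores
```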