Poster in Workshop on Logical Reasoning of Large Language Models

LLATAS: Large LAnguage models as Tabular Auxiliary feature Synthesizer

Yuzhen Mao ⋅ Martin Ester


Abstract

While classical models like Gradient Boosting remain state-of-the-art for tabular data, their performance is often bottlenecked by the limitations of heuristic feature engineering. To address this, we introduce LLATAS, a framework that leverages Large Language Models (LLMs) to synthesize semantic reasoning traces as auxiliary features. Grounded in the Learning Using Privileged Information (LUPI) paradigm, we use these generated signals to train a teacher model, which then guides a lightweight student model operating solely on original inputs. This distillation process allows the student to inherit complex reasoning capabilities without incurring the computational cost of LLMs at inference. Empirical evaluations on disease prediction tasks demonstrate that LLATAS significantly outperforms baselines, reducing test error rates by 17.6% for XGBoost and 22.0% for MLP models.
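
The teacher-student setup described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a generalized-distillation-style objective (the student's targets blend hard labels with teacher probabilities), uses plain logistic regression in place of XGBoost or an MLP, and simulates the LLM-derived auxiliary feature with a synthetic signal. The names `fit_logreg`, `train_lupi_student`, and the mixing weight `lam` are hypothetical.

```python
import math
import random


def sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def fit_logreg(X, targets, epochs=200, lr=0.5):
    """Fit logistic regression by batch gradient descent.

    `targets` may be hard labels (0/1) or soft probabilities in [0, 1];
    the cross-entropy gradient w.r.t. the logit is (prediction - target)
    in both cases.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for xi, ti in zip(X, targets):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - ti
            for j in range(d):
                gw[j] += err * xi[j]
            gb += err
        w = [wj - lr * gj / n for wj, gj in zip(w, gw)]
        b -= lr * gb / n
    return w, b


def predict_proba(model, X):
    w, b = model
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]


def train_lupi_student(X, Z, y, lam=0.5):
    """Train a teacher on [X, Z], then a student on X alone.

    Z holds the privileged auxiliary features, available only at training
    time. `lam` (an assumed hyperparameter) weights the teacher's soft
    predictions against the hard labels.
    """
    XZ = [x + z for x, z in zip(X, Z)]
    teacher = fit_logreg(XZ, y)
    soft = predict_proba(teacher, XZ)
    # Cross-entropy is linear in the target, so blending the targets is
    # equivalent to a weighted sum of the hard-label and imitation losses.
    blended = [(1 - lam) * yi + lam * si for yi, si in zip(y, soft)]
    student = fit_logreg(X, blended)
    return teacher, student


# Synthetic data: `hidden` influences the label but is not in X.
# Z is a noisy view of it, standing in for an LLM-generated feature.
random.seed(0)
X, Z, y = [], [], []
for _ in range(200):
    x = [random.gauss(0, 1), random.gauss(0, 1)]
    hidden = random.gauss(0, 1)
    label = 1 if 2.0 * x[0] - 1.0 * x[1] + hidden > 0 else 0
    X.append(x)
    Z.append([hidden + random.gauss(0, 0.3)])
    y.append(label)

teacher, student = train_lupi_student(X, Z, y)

# At inference the student needs only the original features.
student_probs = predict_proba(student, X)
student_acc = sum((p > 0.5) == (yi == 1) for p, yi in zip(student_probs, y)) / len(y)
```

In this sketch the auxiliary feature is consumed only inside `train_lupi_student`; `predict_proba(student, X)` never touches `Z`, which mirrors the abstract's point that no LLM call is needed at inference time.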
