Virtual presentation / poster accept
Transfer Learning with Deep Tabular Models
Roman Levin · Valeriia Cherepanova · Avi Schwarzschild · Arpit Bansal · C. Bruss · Tom Goldstein · Andrew Wilson · Micah Goldblum
Keywords: [ tabular models ] [ transfer learning ] [ tabular data ] [ representation learning ] [ Deep Learning and representational learning ]
Recent work on deep learning for tabular data demonstrates the strong performance of deep tabular models, often bridging the gap between gradient boosted decision trees and neural networks. Accuracy aside, a major advantage of neural models is that they are easily fine-tuned in new domains and learn reusable features. This property is often exploited in computer vision and natural language applications, where transfer learning is indispensable when task-specific training data is scarce. In this work, we explore the benefits that representation learning provides for knowledge transfer in the tabular domain. We conduct experiments in a realistic medical diagnosis test bed with limited amounts of downstream data and find that transfer learning with deep tabular models provides a definitive advantage over gradient boosted decision tree methods. We further compare supervised and self-supervised pretraining strategies and provide practical advice on transfer learning with tabular models. Finally, we propose a pseudo-feature method for cases where the upstream and downstream feature sets differ, a tabular-specific problem that is widespread in real-world applications.
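The abstract does not spell out how pseudo-features are constructed, but the general idea of bridging mismatched feature sets can be sketched as follows: fit an auxiliary model on the upstream table to predict the feature absent downstream from the shared columns, then impute that prediction as a "pseudo-feature" so both tables share one schema. This is a minimal illustrative sketch, not the paper's exact procedure; the random-forest imputer and the synthetic tables are assumptions for demonstration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Upstream table: 4 shared columns plus one extra column that the
# downstream table does not have.
X_up_shared = rng.normal(size=(500, 4))
extra_up = 2.0 * X_up_shared[:, 0] + rng.normal(scale=0.1, size=500)

# Fit an auxiliary imputer on upstream data: shared features -> missing column.
imputer = RandomForestRegressor(n_estimators=50, random_state=0)
imputer.fit(X_up_shared, extra_up)

# Downstream table lacks the extra column; fill it in with predicted
# pseudo-feature values so both tables share the same feature set.
X_down_shared = rng.normal(size=(100, 4))
pseudo = imputer.predict(X_down_shared)
X_down_full = np.column_stack([X_down_shared, pseudo])

print(X_down_full.shape)  # (100, 5): shared columns + pseudo-feature
```

With the schemas aligned, a model pretrained on the full upstream feature set can be fine-tuned on the augmented downstream table.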