On the Use of Schrödinger Bridges for Tabular Data Generation
Abstract
Schrödinger bridges (SBs) provide a principled framework for generative modeling based on entropy-regularized optimal transport and stochastic control, yet their applicability to tabular data generation remains largely unexplored. In this work, we present a systematic evaluation of modern SB-based solvers for tabular data synthesis and compare them with state-of-the-art GAN- and diffusion-based models under a unified, leakage-free experimental protocol. Using eight continuous tabular datasets from OpenML, independent hyperparameter tuning, cross-validation, and a diverse set of distributional and utility-oriented metrics, we assess both fidelity to the real data distribution and downstream predictive performance in a Train-on-Synthetic, Test-on-Real (TSTR) setting. Our results show that SB-based models, particularly Diffusion Schrödinger Bridge Matching (DSBM), achieve competitive performance on dependency-sensitive distributional metrics and exhibit stable training behavior, while preserving reasonable downstream utility. Although diffusion-based baselines remain strong when predictive accuracy is the primary objective, Schrödinger bridges emerge as a complementary and structurally robust alternative for tabular data generation, offering explicit control over global geometry and dependency structure. These findings clarify the role of transport-based generative modeling in the tabular domain and highlight promising directions for future research on mixed-type data and task-aware objectives.