Do Foundation Models Generalize to Real-World EV Fleets? A 1.1M-Drive Benchmark
Abstract
Time series foundation models (TSFMs) promise general-purpose forecasting, yet their effectiveness on large-scale industrial data with physical heterogeneity remains underexplored. We benchmark three zero-shot TSFMs and two supervised models against a Cluster-Aware Mixture of Experts on 1.1 million real-world electric vehicle driving sequences. The Cluster-Aware approach routes inputs to specialized LSTM experts based on their Soft Dynamic Time Warping clusters, reducing mean absolute error by 14.7\% relative to a global baseline and outperforming all other models. Evaluation on an unseen vehicle architecture shows robust transfer, with error increasing by only 8.1\%. Our results suggest that while TSFMs deliver competitive zero-shot performance, domain-informed supervised specialization remains advantageous on this heterogeneous industrial dataset.