Complementing Domain Labels with WaveEnergy Signatures for Time Series Heterogeneity
Abstract
Time series data are ubiquitous across real-world applications, yet they are highly heterogeneous in temporal dynamics, data formats, and acquisition sources, which poses a major challenge for pre-training large-scale time series foundation models (TSFMs). Existing TSFMs often exploit human-assigned domain labels, which are dataset-level annotations derived from external metadata, as implicit or explicit supervision to learn domain-specific representations. However, because a single domain label is shared by all windows within a dataset, it provides a coarse and potentially ambiguous signal that fails to reflect window-level heterogeneity: similar local dynamics can appear across domains, and diverse dynamics can coexist within the same domain. This window-level heterogeneity leaves domain-label supervision under-specified for learning dynamics-aware representations in TSFMs. To address this limitation, we introduce WaveEnergy, which uses wavelet decomposition to represent time series windows as multi-scale components and derives, from the energy of the resulting coefficients, a dynamics-aware signature that complements domain labels. Experimentally, WaveEnergy signatures align more strongly with TSFM embeddings than domain labels do and enable finer-grained characterization of datasets by capturing within-domain differences in temporal dynamics.
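As a rough illustration of the idea of a wavelet-energy signature, the following minimal sketch decomposes a window into multi-scale coefficients and summarizes each scale by its share of the total energy. It is not the authors' implementation: the choice of the Haar wavelet, the number of levels, the normalization to relative energies, and the function names `haar_dwt` and `energy_signature` are all assumptions made for this example.

```python
import math


def haar_dwt(signal, levels):
    """Multi-level Haar wavelet decomposition (illustrative assumption).

    Returns a list of coefficient bands ordered from the finest detail
    band to the coarsest, followed by the final approximation band.
    """
    approx = list(signal)
    bands = []
    for _ in range(levels):
        if len(approx) < 2:
            break
        if len(approx) % 2:
            # Pad odd-length inputs by repeating the last sample.
            approx.append(approx[-1])
        pairs = list(zip(approx[0::2], approx[1::2]))
        detail = [(a - b) / math.sqrt(2) for a, b in pairs]
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
        bands.append(detail)
    bands.append(approx)
    return bands


def energy_signature(window, levels=3):
    """Relative energy per wavelet band for one time series window.

    Energy of a band is the sum of its squared coefficients; dividing by
    the total makes the signature independent of the window's amplitude.
    """
    bands = haar_dwt(window, levels)
    energies = [sum(c * c for c in band) for band in bands]
    total = sum(energies)
    if total == 0:
        return [0.0] * len(energies)
    return [e / total for e in energies]


# A rapidly alternating window and a constant window, both hypothetical.
fast = energy_signature([1.0, -1.0] * 8, levels=3)
slow = energy_signature([1.0] * 16, levels=3)
```

Under these assumptions, the rapidly alternating window concentrates its energy in the finest detail band, while the constant window concentrates it in the approximation band, so two windows carrying the same domain label can still receive clearly different signatures.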