Poster
in
Workshop: Workshop on Learning from Time Series for Health
Combating Missing Values in Multivariate Time Series by Learning to Embed Each Value as a Token
Chun-Kai Huang · Yi-Hsien Hsieh · Shao-Hua Sun · Tung-Hung Su · JH Kao · Che Lin
Keywords: [ representation learning ] [ imputation-free ] [ multivariate time series data ] [ missing value ]
Irregularly and asynchronously sampled multivariate time series (MTS) data are often riddled with missing values. Most existing methods embed features according to timestamps and therefore require imputing missing values. However, imputed values can differ drastically from the real ones, leading to inaccurate predictions based on imputation. To address this issue, we propose a novel concept, “each value as a token (EVAT),” which treats each feature value as an independent token and thus bypasses imputation entirely. To realize EVAT, we propose scalable numerical embedding, which learns to embed each feature value by automatically discovering relationships among features. We integrate the proposed embedding with a Transformer encoder, yielding the Scalable nUMerical eMbeddIng Transformer (SUMMIT), which produces accurate predictions from MTS with missing values. We conduct experiments on three distinct electronic health record (EHR) datasets with high missing rates. The experimental results verify SUMMIT's efficacy: it outperforms models that require imputation.
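The abstract does not specify the exact embedding function, but the core EVAT idea — tokenizing only the observed values so no imputation is needed — can be sketched as follows. This is a minimal, hypothetical NumPy illustration: the parameter names (`feature_embed`, `value_scale`) and the additive feature-identity-plus-value form are assumptions, not the paper's actual formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

n_features = 4   # number of clinical variables (e.g., lab measurements)
d_embed = 8      # embedding dimension

# Learnable parameters (randomly initialized for this sketch):
# an identity embedding and a value-scaling vector per feature.
feature_embed = rng.normal(size=(n_features, d_embed))
value_scale = rng.normal(size=(n_features, d_embed))

def evat_tokens(values, mask):
    """Embed each OBSERVED feature value as its own token.

    values: (n_features,) raw measurements at one timestamp;
    mask:   (n_features,) bool, True where a value was observed.
    Missing entries are simply skipped -- no imputation is performed.
    """
    idx = np.flatnonzero(mask)
    # token = feature-identity embedding + value-modulated component
    return feature_embed[idx] + values[idx, None] * value_scale[idx]

values = np.array([1.2, np.nan, 0.5, np.nan])
mask = ~np.isnan(values)
tokens = evat_tokens(values, mask)
print(tokens.shape)  # (2, 8): one token per observed value only
```

The resulting variable-length token sequence can then be fed to a standard Transformer encoder, which handles varying numbers of tokens natively; this is what lets the model sidestep imputation.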