Poster in Workshop: 1st ICLR Workshop on Time Series in the Age of Large Models

Giving Sensors a Voice: Multimodal JEPA for Semantic Time-Series Embeddings

Pastrana ⋅ Sina Pakazad ⋅ Utsav Dutta ⋅ Henrik Ohlsson

Abstract

We introduce CHARM (Channel-Aware Representation Model), a multimodal architecture for self-supervised time series representation learning that incorporates channel-level textual descriptions into both temporal convolutional and attention layers. This enables the model to reason about sensor identity and inter-channel relationships while remaining invariant to channel ordering. Trained with a Joint Embedding Predictive Architecture (JEPA), CHARM learns temporally stable, noise-robust embeddings by predicting in latent space rather than reconstructing raw signals. Across classification, forecasting, and anomaly detection benchmarks, CHARM's frozen embeddings with a lightweight linear probe match or outperform significantly larger task-specific foundation models.
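To make the phrase "predicting in latent space rather than reconstructing raw signals" concrete, the following is a minimal NumPy sketch of a JEPA-style objective on a multichannel window. It is not CHARM's implementation: the toy linear-plus-tanh encoder, the mean-pooled context summary, the single linear predictor, and every name (`encode`, `jepa_loss`, `ema_update`, `W_ctx`, `W_tgt`, `W_pred`) are assumptions made for illustration, and the channel-level text conditioning described in the abstract is omitted entirely.

```python
import numpy as np

def encode(x, W):
    """Toy encoder (assumption): per-timestep linear map plus tanh. (T, C) -> (T, D)."""
    return np.tanh(x @ W)

def jepa_loss(x, mask, W_ctx, W_tgt, W_pred):
    """Mean squared error between predicted and target latents on masked timesteps.

    x:    (T, C) multichannel window
    mask: (T,) boolean, True where a timestep is hidden from the context encoder
    """
    # Target branch encodes the full window; a real setup would block gradients here.
    targets = encode(x, W_tgt)
    # Context branch only sees the unmasked timesteps (masked ones are zeroed).
    x_ctx = np.where(mask[:, None], 0.0, x)
    context = encode(x_ctx, W_ctx)
    # Pool the visible context into one vector and map it through the predictor.
    # A realistic predictor would also condition on the masked positions.
    summary = context[~mask].mean(axis=0)   # (D,)
    pred = summary @ W_pred                 # (D,)
    # The error is measured between latents, never against the raw signal x.
    return float(np.mean((pred[None, :] - targets[mask]) ** 2))

def ema_update(W_tgt, W_ctx, momentum=0.99):
    """Move the target weights toward the context weights by exponential moving average."""
    return momentum * W_tgt + (1.0 - momentum) * W_ctx

# Hypothetical toy data: 32 timesteps, 4 channels, 8-dimensional latents.
rng = np.random.default_rng(0)
T, C, D = 32, 4, 8
x = rng.normal(size=(T, C))
mask = np.zeros(T, dtype=bool)
mask[20:28] = True                          # hide a contiguous block of timesteps

W_ctx = rng.normal(scale=0.1, size=(C, D))
W_tgt = W_ctx.copy()                        # target encoder starts as a copy
W_pred = np.eye(D)

loss = jepa_loss(x, mask, W_ctx, W_tgt, W_pred)
W_tgt = ema_update(W_tgt, W_ctx)            # would follow each optimizer step
```

In a full training loop, `W_ctx` and `W_pred` would be updated by gradient descent on `loss`, while `W_tgt` changes only through `ema_update`. This slowly moving target is a common choice in JEPA-style training to discourage collapsed embeddings; whether CHARM uses exactly this scheme is not stated in the abstract.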
