

Poster in Workshop: Setting up ML Evaluation Standards to Accelerate Progress

Rethinking Streaming Machine Learning Evaluation

Shreya Shankar · Bernease Herman · Aditya Parameswaran


Abstract:

While most work on evaluating machine learning (ML) models focuses on batches of data, computing the same metrics in a streaming setting (i.e., over unbounded, timestamp-ordered datasets) fails to accurately identify when models are performing unexpectedly. In this position paper, we discuss how the sliding windows over which ML metrics are computed can be negatively affected by real-world phenomena (e.g., delayed arrival of labels), and we propose additional metrics to assess streaming ML performance.
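As a rough illustration of the failure mode the abstract describes (not code from the paper), the following minimal Python sketch computes accuracy over a sliding time window when labels can arrive after their predictions. The window size, record layout, and function name are assumptions for illustration only.

```python
from datetime import datetime, timedelta

# Hypothetical sliding-window accuracy with delayed labels.
# Each record is (pred_time, label_time, prediction, label); label_time is
# None if the label has not arrived yet.

WINDOW = timedelta(hours=1)  # assumed window size

def windowed_accuracy(records, now):
    """Accuracy over predictions made in the last WINDOW before `now`."""
    window_start = now - WINDOW
    correct = total = 0
    for pred_time, label_time, prediction, label in records:
        if pred_time < window_start or pred_time > now:
            continue  # prediction falls outside the sliding window
        if label_time is None or label_time > now:
            continue  # label delayed: silently excluded from the metric
        total += 1
        correct += int(prediction == label)
    # When labels are heavily delayed, `total` is a small, biased sample of
    # the window's predictions, so the reported accuracy can look fine even
    # while the model is misbehaving.
    return correct / total if total else float("nan")
```

The silent exclusion of not-yet-labeled predictions is exactly the kind of distortion that batch-style metrics do not surface, which motivates the additional streaming metrics the paper proposes.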
