Information-Theoretic Unsupervised Embedding Quality Evaluation
Mikhail Kuznetsov ⋅ Ivan Butakov ⋅ Marina Munkhoeva ⋅ Alexey Frolov ⋅ Ivan Oseledets
Abstract
This study revisits existing unsupervised measures of embedding quality and introduces new metrics rooted in information theory. We establish that classical spectral metrics such as rank, effective rank, and NESum form a unified family of Rényi entropies. An extensive evaluation of both existing and new approaches reveals that most generalization failures of self-supervised representation learning (SSRL) models can be explained by linear deficiencies of the embeddings, without resorting to more intricate metrics such as clustering or entropy. Non-linear metrics, on the other hand, prove useful for quantifying model alignment.
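To illustrate the Rényi-entropy view of the spectral metrics, the following is a minimal sketch, not the authors' implementation. It assumes the metrics are computed on the eigenvalues of the centered embedding covariance, normalized to sum to one; the paper's exact definitions and normalizations may differ. Under that assumption, the exponentiated Rényi entropy of order 0 gives the numerical rank, order 1 gives an effective rank, and order infinity gives the sum of eigenvalues divided by the largest one (a NESum-style quantity). The function names `normalized_spectrum` and `renyi_entropy` and the tolerance `tol` are illustrative choices.

```python
import numpy as np


def normalized_spectrum(embeddings):
    """Covariance eigenvalues of the embeddings, normalized to sum to 1.

    Assumption: the spectrum is taken from the centered embedding matrix;
    other formulations use uncentered singular values instead.
    """
    centered = embeddings - embeddings.mean(axis=0, keepdims=True)
    # Squared singular values of the centered data are proportional
    # to the eigenvalues of the sample covariance matrix.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    eigenvalues = singular_values ** 2
    return eigenvalues / eigenvalues.sum()


def renyi_entropy(p, alpha, tol=1e-12):
    """Rényi entropy of order alpha for a probability vector p (in nats)."""
    p = p[p > tol]  # drop numerically zero entries
    if alpha == 0:
        return np.log(len(p))            # log of the support size (rank)
    if alpha == 1:
        return -np.sum(p * np.log(p))    # Shannon entropy (limit alpha -> 1)
    if np.isinf(alpha):
        return -np.log(p.max())          # min-entropy (limit alpha -> inf)
    return np.log(np.sum(p ** alpha)) / (1.0 - alpha)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical embeddings: 8 dimensions, but only 3 linearly independent.
    Z = rng.normal(size=(200, 3)) @ rng.normal(size=(3, 8))
    p = normalized_spectrum(Z)
    print("rank-like      :", np.exp(renyi_entropy(p, 0)))
    print("effective rank :", np.exp(renyi_entropy(p, 1)))
    print("NESum-like     :", np.exp(renyi_entropy(p, np.inf)))
```

Because Rényi entropy is non-increasing in its order, the three quantities are ordered from largest (rank) to smallest (NESum-style), which gives a quick sanity check when comparing them on the same embeddings.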