Optimal learning rate scaling depends on data in deep scalar linear networks

Yedi Zhang ⋅ Peter Latham ⋅ Leena Chennuru Vankadara ⋅ Andrew Saxe

Abstract
