Special solutions with small volume exist
Tausifa Jan Saleem ⋅ Ramanjit Ahuja ⋅ Surendra Prasad ⋅ Brejesh Lall
Abstract
The Lottery Ticket Hypothesis for deep neural networks emphasizes the importance of the initialization used to re-train the sparser networks obtained through iterative magnitude pruning. An explanation for why the specific initialization proposed by the Lottery Ticket Hypothesis tends to yield better generalization (and training) performance has been lacking. In this work, we attempt to provide insight into this phenomenon by empirically studying the volume, geometry, and loss-landscape characteristics of the solutions obtained at various stages of the iterative magnitude pruning process.