Principled Weight Initialization for Hypernetworks

Oscar Chang; Lampros Flokas; Hod Lipson

Keywords: hypernetworks, meta learning, optimization

Thursday: Optimisation II

Abstract: Hypernetworks are meta neural networks that generate weights for a main neural network in an end-to-end differentiable manner. Despite extensive applications ranging from multi-task learning to Bayesian deep learning, the problem of optimizing hypernetworks has not been studied to date. We observe that classical weight initialization methods like Glorot & Bengio (2010) and He et al. (2015), when applied directly to a hypernet, fail to produce mainnet weights at the correct scale. We develop principled techniques for weight initialization in hypernets, and show that they lead to more stable mainnet weights, lower training loss, and faster convergence.
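The scale mismatch described in the abstract can be illustrated with a small numerical sketch. It is not the paper's method: it assumes a hypothetical hypernet consisting of a single bias-free linear layer that maps an embedding with unit-variance entries to the flattened weight matrix of one mainnet layer. The function name mainnet_weight_variance and all dimensions are chosen for illustration only. Under these assumptions, He-style fan-in initialization of the hypernet layer yields mainnet weights with variance near 2 rather than the 2 / fan_in that He initialization would give the mainnet layer itself, while dividing the hypernet variance by the mainnet fan-in restores the intended scale.

```python
import numpy as np


def mainnet_weight_variance(hyper_var, embed_dim=64, fan_in=128, fan_out=64,
                            num_embeddings=32, seed=0):
    """Empirical variance of mainnet weights emitted by a linear hypernet.

    Assumption: the hypernet is one bias-free linear layer, and embedding
    entries are i.i.d. with zero mean and unit variance.
    """
    rng = np.random.default_rng(seed)
    # Hypothetical embeddings, one row per generated mainnet layer.
    embeddings = rng.standard_normal((num_embeddings, embed_dim))
    # Hypernet weights with the requested variance; each row produces
    # one scalar weight of the mainnet layer.
    hyper_weights = rng.standard_normal((fan_out * fan_in, embed_dim))
    hyper_weights *= np.sqrt(hyper_var)
    # Each row holds the flattened (fan_out x fan_in) mainnet weight matrix.
    mainnet_weights = embeddings @ hyper_weights.T
    return float(mainnet_weights.var())


embed_dim, fan_in = 64, 128
# Scale that He initialization would assign to the mainnet layer directly.
target = 2.0 / fan_in

# Naive: He fan-in initialization computed from the hypernet's own fan-in.
naive = mainnet_weight_variance(2.0 / embed_dim,
                                embed_dim=embed_dim, fan_in=fan_in)
# Rescaled: additionally divide by the mainnet fan-in (illustrative choice).
rescaled = mainnet_weight_variance(2.0 / (embed_dim * fan_in),
                                   embed_dim=embed_dim, fan_in=fan_in)

print(f"target variance:   {target:.5f}")
print(f"naive hypernet:    {naive:.5f}")
print(f"rescaled hypernet: {rescaled:.5f}")
```

The rescaling follows from the variance of a sum of independent terms: each generated weight is a sum of embed_dim products, so its variance is roughly embed_dim times the hypernet weight variance times the embedding variance. Matching that product to 2 / fan_in gives the factor used above; hypernets with biases, multiple layers, or non-unit embedding variance would need a different calculation.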

Similar Papers

FSNet: Compression of Deep Convolutional Neural Networks by Filter Summary

Yingzhen Yang, Jiahui Yu, Nebojsa Jojic, Jun Huan, Thomas S. Huang

Spike-based causal inference for weight alignment

Jordan Guerguiev, Konrad Kording, Blake Richards

Picking Winning Tickets Before Training by Preserving Gradient Flow

Chaoqi Wang, Guodong Zhang, Roger Grosse