

Workshop

No Spurious Local Minima in a Two Hidden Unit ReLU Network

Chenwei Wu · Jiajun Luo · Jason D. Lee

East Meeting Level 8 + 15 #20

Wed 2 May, 11 a.m. PDT

Deep learning models can be efficiently optimized via stochastic gradient descent, but there is little theoretical evidence to support this. A key question in optimization is to understand when the optimization landscape of a neural network is amenable to gradient-based optimization. We focus on a simple two-layer ReLU network with two hidden units, and show that all local minimizers are global. This, combined with the recent work of Lee et al. (2017) and Lee et al. (2016), shows that gradient descent converges to the global minimizer.
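The setting can be made concrete with a small numerical sketch. Below is a minimal teacher-student experiment, assuming a two-hidden-unit ReLU network of the form f(x) = ReLU(w1·x) + ReLU(w2·x) trained with squared loss on Gaussian inputs and plain gradient descent; the input dimension, teacher weights, learning rate, and sample size are illustrative assumptions, not values from the paper. In a landscape with no spurious local minima, such a run is expected to drive the loss to (near) zero from a random initialization.

```python
import numpy as np

# Minimal sketch: two-hidden-unit ReLU network trained with plain gradient
# descent in a teacher-student setup (squared loss, Gaussian inputs).
# The dimension, teacher weights, learning rate, and sample size below are
# illustrative assumptions, not values taken from the paper.

rng = np.random.default_rng(0)
d = 5                                   # input dimension (assumption)
teacher = rng.standard_normal((2, d))   # ground-truth weights w1*, w2*

def forward(W, X):
    """Two-hidden-unit ReLU network: f(x) = relu(w1.x) + relu(w2.x)."""
    return np.maximum(X @ W.T, 0.0).sum(axis=1)

def loss_and_grad(W, X, y):
    """Squared loss and its (sub)gradient w.r.t. the student weights W (2 x d)."""
    pre = X @ W.T                        # (n, 2) pre-activations
    act = np.maximum(pre, 0.0)
    resid = act.sum(axis=1) - y          # (n,)
    loss = 0.5 * np.mean(resid ** 2)
    # d loss / d w_j = mean_n resid_n * 1[w_j . x_n > 0] * x_n
    mask = (pre > 0).astype(float)       # ReLU subgradient indicator
    grad = (mask * resid[:, None]).T @ X / X.shape[0]
    return loss, grad

# Data generated by the teacher network.
n = 2000
X = rng.standard_normal((n, d))
y = forward(teacher, X)

# Plain gradient descent from a random initialization.
W = rng.standard_normal((2, d))
lr = 0.1
for step in range(2000):
    loss, grad = loss_and_grad(W, X, y)
    W -= lr * grad

print(f"final loss: {loss:.2e}")  # near zero if no spurious local minimum traps the iterate
```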
