ICLR Analytical Solution of Three-layer Network with Matrix Exponential Activation Function

Poster in Workshop: Bridging the Gap Between Practice and Theory in Deep Learning

Analytical Solution of Three-layer Network with Matrix Exponential Activation Function

Kuo Gai · Shihua Zhang


Abstract: It is known that, in practice, deeper networks tend to be more powerful than shallow ones, but this has not been understood theoretically. In this paper, we show that a three-layer network with matrix exponential activation function, i.e., $$f(X)=W_3\exp(W_2\exp(W_1X)),\quad X\in \mathbb{C}^{d\times d},$$ has analytical solutions for the equations $$\begin{cases}Y_1=f(X_1) \\ Y_2=f(X_2) \end{cases}$$ for $X_1,X_2,Y_1,Y_2$ under invertibility assumptions only. Our proof shows the power of depth and of the non-linear activation function, since a one-layer network can only solve one equation, i.e., $Y=WX$.
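The contrast drawn in the abstract can be illustrated numerically. The following is a minimal sketch, not the authors' construction: it assumes that $\exp$ denotes the matrix exponential (computed here with `scipy.linalg.expm`), and the dimension `d`, the random data, the weight scale `0.1`, and the helper names `random_complex` and `f` are all hypothetical choices made for illustration. It evaluates the three-layer map $f$ for arbitrary weights, then shows that a one-layer network $Y=WX$ fitted to one pair $(X_1,Y_1)$ via $W=Y_1X_1^{-1}$ generally fails on a second pair $(X_2,Y_2)$.

```python
import numpy as np
from scipy.linalg import expm  # matrix exponential, not elementwise exp

rng = np.random.default_rng(0)
d = 3  # illustrative dimension (assumption)


def random_complex(d, scale=1.0):
    """Random complex d x d matrix; almost surely invertible."""
    return scale * (rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d)))


def f(X, W1, W2, W3):
    """Three-layer network f(X) = W3 exp(W2 exp(W1 X)) from the abstract."""
    return W3 @ expm(W2 @ expm(W1 @ X))


# Two input/output pairs, as in the system Y1 = f(X1), Y2 = f(X2).
X1, X2, Y1, Y2 = (random_complex(d) for _ in range(4))

# Forward pass with arbitrary small weights (scaled down to avoid overflow
# in the nested exponentials); these are NOT the analytical solution.
W1, W2, W3 = (random_complex(d, scale=0.1) for _ in range(3))
out = f(X1, W1, W2, W3)

# One-layer network Y = W X: a single equation fixes W completely.
W = Y1 @ np.linalg.inv(X1)
err1 = np.linalg.norm(W @ X1 - Y1)  # residual on the fitted pair
err2 = np.linalg.norm(W @ X2 - Y2)  # residual on the second pair

print("residual on (X1, Y1):", err1)
print("residual on (X2, Y2):", err2)
```

The one-layer fit satisfies both equations only when $Y_1X_1^{-1}=Y_2X_2^{-1}$, which fails for generic data. That is the limitation the abstract contrasts with the three-layer network.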

