ICLR Poster Minimum width for universal approximation using ReLU networks on compact domain

Namjun Kim · Chanho Min · Sejun Park

Halle B #229

Abstract: It has been shown that deep neural networks of a large enough width are universal approximators, but they are not if the width is too small. There were several attempts to characterize the minimum width $w_{\min}$ enabling the universal approximation property; however, only a few of them found the exact values. In this work, we show that the minimum width for $L^p$ approximation of $L^p$ functions from $[0,1]^{d_x}$ to $\mathbb R^{d_y}$ is exactly $\max\{d_x,d_y,2\}$ if an activation function is ReLU-Like (e.g., ReLU, GELU, Softplus). Compared to the known result for ReLU networks, $w_{\min}=\max\{d_x+1,d_y\}$ when the domain is $\mathbb R^{d_x}$, our result first shows that approximation on a compact domain requires smaller width than on $\mathbb R^{d_x}$. We next prove a lower bound on $w_{\min}$ for uniform approximation using general activation functions including ReLU: $w_{\min}\ge d_y+1$ if $d_x
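
As a small illustration of the two $L^p$ width formulas quoted in the abstract, the sketch below evaluates them for a few input/output dimensions. The function names are hypothetical and chosen only for this example; the code does nothing beyond the arithmetic of $\max\{d_x,d_y,2\}$ and $\max\{d_x+1,d_y\}$, and it does not cover the uniform-approximation lower bound.

```python
def min_width_compact_lp(d_x: int, d_y: int) -> int:
    """Width stated in the abstract for L^p approximation on [0,1]^{d_x}
    with a ReLU-like activation: max{d_x, d_y, 2}.
    (Hypothetical helper name, for illustration only.)"""
    return max(d_x, d_y, 2)


def min_width_whole_space_lp(d_x: int, d_y: int) -> int:
    """Previously known width quoted in the abstract for ReLU networks
    when the domain is R^{d_x}: max{d_x + 1, d_y}.
    (Hypothetical helper name, for illustration only.)"""
    return max(d_x + 1, d_y)


if __name__ == "__main__":
    # Compare the two formulas on a few (d_x, d_y) pairs.
    for d_x, d_y in [(1, 1), (3, 2), (2, 5)]:
        compact = min_width_compact_lp(d_x, d_y)
        whole = min_width_whole_space_lp(d_x, d_y)
        print(f"d_x={d_x}, d_y={d_y}: compact={compact}, whole space={whole}")
```

Taken purely as arithmetic, the two formulas differ only when $d_x \ge 2$ and $d_y \le d_x$, in which case the compact-domain value is smaller by one; in all other cases they coincide.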
