Universal approximation theorem

The universal approximation theorem plays a central role in deep learning. Cybenko (1989) showed the following:

Let \( \sigma \) be any continuous sigmoidal function, that is, a function whose limits are

$$ \sigma(z) \rightarrow \left\{\begin{array}{cc} 1 & \mathrm{as}\ z\rightarrow \infty,\\ 0 & \mathrm{as}\ z \rightarrow -\infty. \end{array}\right. $$
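
A familiar example of such a sigmoidal function is the standard logistic function,

$$ \sigma(z) = \frac{1}{1+e^{-z}}, $$

which tends to \( 1 \) as \( z\rightarrow \infty \) and to \( 0 \) as \( z\rightarrow -\infty \).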

Given a continuous and deterministic function \( F(\boldsymbol{x}) \) on the \( d \)-dimensional unit cube, that is \( F:[0,1]^d\rightarrow \mathbb{R} \) with \( \boldsymbol{x}\in [0,1]^d \), and a parameter \( \epsilon >0 \), there is a neural network with one hidden layer \( f(\boldsymbol{x};\boldsymbol{\Theta}) \), with parameters \( \boldsymbol{\Theta}=(\boldsymbol{W},\boldsymbol{b}) \), \( \boldsymbol{W}\in \mathbb{R}^{m\times n} \) and \( \boldsymbol{b}\in \mathbb{R}^{n} \), for which

$$ \vert F(\boldsymbol{x})-f(\boldsymbol{x};\boldsymbol{\Theta})\vert < \epsilon \hspace{0.1cm} \forall \boldsymbol{x}\in[0,1]^d. $$
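
The theorem guarantees the existence of such a network, but says nothing about how to find its parameters. A small numerical sketch can still make the statement concrete. The code below, an illustrative example rather than part of Cybenko's argument, fits a one-hidden-layer network of the Cybenko form \( f(x;\boldsymbol{\Theta})=\sum_j \alpha_j\sigma(w_j x + b_j) \) to the continuous target \( F(x)=\sin(2\pi x) \) on \( [0,1] \) with plain gradient descent in NumPy; the target function, hidden-layer width, learning rate, and number of iterations are arbitrary choices made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def F(x):
    # Target: a continuous function on the unit interval (d = 1 here)
    return np.sin(2 * np.pi * x)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One-hidden-layer network f(x; Theta) = sum_j alpha_j * sigmoid(w_j * x + b_j)
n_hidden = 30                       # width of the hidden layer (arbitrary)
w = rng.normal(size=n_hidden)       # input-to-hidden weights
b = rng.normal(size=n_hidden)       # hidden biases
alpha = rng.normal(size=n_hidden)   # hidden-to-output weights

x = np.linspace(0.0, 1.0, 200)      # grid on the unit interval
y = F(x)

eta = 0.1                           # learning rate (arbitrary)
for step in range(20000):
    h = sigmoid(np.outer(x, w) + b)     # hidden activations, shape (200, n_hidden)
    err = h @ alpha - y                 # pointwise error f(x) - F(x)

    # Gradients of the (half) mean-squared error on the grid,
    # all evaluated at the current parameter values
    common = err[:, None] * alpha * h * (1 - h)
    alpha -= eta * (h.T @ err) / len(x)
    w -= eta * (common * x[:, None]).sum(axis=0) / len(x)
    b -= eta * common.sum(axis=0) / len(x)

f = sigmoid(np.outer(x, w) + b) @ alpha
print("max |F(x) - f(x; Theta)| on the grid:", np.max(np.abs(y - f)))
```

Increasing the number of hidden nodes (and training longer) is the practical counterpart of the theorem's promise that the uniform error can be pushed below any prescribed \( \epsilon \).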