Cybenko Universal Approximation Theorem Lemma 1


I am having difficulty understanding one of the steps in the proof of Lemma 1 of the Cybenko Universal Approximation Theorem.

Cybenko defines a sigmoidal function as $\sigma:\mathbb{R}\rightarrow\mathbb{R}$ such that

  1. $\displaystyle\lim_{t\rightarrow\infty}\sigma(t)=1$
  2. $\displaystyle \lim_{t\rightarrow-\infty}\sigma(t)=0$

He also uses $I_n=[0,1]^n$ and $M(I_n)=\{\mu:\mu\text{ is a finite, signed, regular Borel measure on } I_n \}.$ He also includes the definition of a discriminatory function.

Lemma 1. Any bounded, measurable sigmoidal function, $\sigma$, is discriminatory.

The proof proceeds as follows:

Let $x,y\in\mathbb{R}^n$ and $b,\varphi\in \mathbb{R}$. Define for each $\lambda\in\mathbb{R}$: $$\sigma_\lambda(x)=\sigma(\lambda(\langle x,y\rangle+b)+\varphi)$$

Then we have three cases:

Case 1. $\langle x,y\rangle+b=0$. Then,

$\displaystyle\lim_{\lambda\rightarrow\infty} \sigma_\lambda(x)=\sigma(\varphi)$

Case 2. $\langle x,y\rangle+b<0$.

$\displaystyle\lim_{\lambda\rightarrow\infty} \sigma_\lambda(x)=0$

Case 3. $\langle x,y\rangle+b>0$.

$\displaystyle\lim_{\lambda\rightarrow\infty} \sigma_\lambda(x)=1$

Define $\gamma:\mathbb{R}^n\rightarrow\mathbb{R}$ by

$\gamma(x)=\begin{cases} 0, & \langle x,y\rangle+b<0\\ 1, & \langle x,y\rangle+b>0\\ \sigma(\varphi), & \langle x,y\rangle+b=0 \end{cases}$

Then we have $$\lim_{\lambda\rightarrow\infty} \sigma_\lambda(x)=\gamma(x)$$ for every $x$. So the family $\{\sigma_\lambda:\lambda\in\mathbb{R}\}$ converges pointwise to $\gamma$ as $\lambda\rightarrow\infty$, and it is uniformly bounded because $\sigma$ is bounded.
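To make the three cases concrete, here is a small numerical sketch of the pointwise convergence. It assumes the logistic function as the sigmoidal $\sigma$ (the lemma itself only needs boundedness, measurability, and the two limits), and the values of `y`, `b`, `phi` and the sample points are arbitrary choices for illustration.

```python
import math

def sigmoid(t):
    """Logistic function (an assumed example of a sigmoidal sigma),
    written so that large |t| does not overflow."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def affine(x, y, b):
    """<x, y> + b"""
    return sum(xi * yi for xi, yi in zip(x, y)) + b

def sigma_lambda(x, y, b, phi, lam):
    """sigma_lambda(x) = sigma(lambda * (<x, y> + b) + phi)"""
    return sigmoid(lam * affine(x, y, b) + phi)

def gamma(x, y, b, phi):
    """Pointwise limit of sigma_lambda as lambda -> infinity."""
    s = affine(x, y, b)
    if s > 0:
        return 1.0
    if s < 0:
        return 0.0
    return sigmoid(phi)

# Illustrative values, chosen so that <x, y> + b is exact in floating point.
y, b, phi = (1.0, -1.0), -0.25, 0.3
samples = {
    "positive side": (0.75, 0.25),    # <x, y> + b =  0.25
    "negative side": (0.25, 0.75),    # <x, y> + b = -0.75
    "on the hyperplane": (0.5, 0.25), # <x, y> + b =  0
}

for name, x in samples.items():
    values = [sigma_lambda(x, y, b, phi, lam) for lam in (1.0, 10.0, 100.0, 1e4)]
    print(name, values, "limit:", gamma(x, y, b, phi))
```

On the hyperplane the value does not depend on $\lambda$ at all, which is why the limit there is $\sigma(\varphi)$ rather than 0 or 1.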

Let $\mu\in M(I_n)$. Since $\mu$ is finite, we can apply the Lebesgue Bounded Convergence Theorem to obtain: $$\lim_{\lambda\rightarrow \infty}\int_{I_n} \sigma_\lambda(x)d\mu(x) = \int_{I_n} \gamma(x)d\mu(x)$$
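The limit of the integrals can be checked numerically in a toy setting. The sketch below assumes a discrete signed measure $\mu=\sum_i w_i\delta_{x_i}$ (so that integration is a finite weighted sum) and the logistic function for $\sigma$; the points, weights, `y`, `b`, and `phi` are illustrative choices, not anything from Cybenko's paper.

```python
import math

def sigmoid(t):
    """Logistic function, an assumed example of a sigmoidal sigma."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def affine(x, y, b):
    """<x, y> + b"""
    return sum(xi * yi for xi, yi in zip(x, y)) + b

def integrate(f, points, weights):
    """Integral of f against the discrete signed measure sum_i w_i * delta_{x_i}."""
    return sum(w * f(x) for x, w in zip(points, weights))

# Illustrative parameters and a toy signed measure on [0, 1]^2.
y, b, phi = (1.0, -1.0), -0.25, 0.3
points = [(0.75, 0.25), (0.25, 0.75), (0.5, 0.25)]
weights = [2.0, -3.0, 0.5]

def sigma_lam(lam):
    """Return the function x -> sigma(lam * (<x, y> + b) + phi)."""
    return lambda x: sigmoid(lam * affine(x, y, b) + phi)

def gamma(x):
    """Pointwise limit of sigma_lam(lam) as lam -> infinity."""
    s = affine(x, y, b)
    if s > 0:
        return 1.0
    if s < 0:
        return 0.0
    return sigmoid(phi)

limit = integrate(gamma, points, weights)
for lam in (1.0, 10.0, 100.0, 1e4):
    print(lam, integrate(sigma_lam(lam), points, weights), "target:", limit)
```

For a finite sum the interchange of limit and integral is trivial; the theorem is what justifies the same step for a general finite signed Borel measure.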

Note that here in Cybenko's paper, he erroneously writes that: $$\int_{I_n} \sigma_\lambda(x)d\mu(x) = \int_{I_n} \gamma(x)d\mu(x)$$ I believe in this case he just left off the limit. Moving past this typo, however, he concludes that $$0=\int_{I_n} \sigma_\lambda(x)d\mu(x)$$

My question is what is the justification for this step? That is, why is the integral equal to 0?

On BEST ANSWER

This is by assumption. The definition of $\sigma$ discriminatory is that for every $\mu\in M(I_n)$, $$\left[(\forall y)( \forall b) \int_{I_n} \sigma(\langle x,y\rangle +b)d\mu(x)=0\right]\Rightarrow \mu=0$$ So we assume that $\int_{I_n} \sigma(\langle x,y\rangle +b)d\mu(x)=0$ for each $y\in \mathbb{R}^n$ and $b\in\mathbb{R}$.

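Notice that $$\sigma_\lambda(x)=\sigma(\langle x, \lambda y\rangle+(\lambda b+\varphi)).$$ So this is a particular case of the assumption, with $\lambda y$ in place of $y$ and $\lambda b+\varphi$ in place of $b$. Hence $\int_{I_n} \sigma_\lambda(x)d\mu(x)=0$ for every $\lambda$.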
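The rewriting of $\sigma_\lambda$ is just linearity of the inner product, and a quick numerical check can confirm it. The sketch below again assumes the logistic function for $\sigma$, and the specific numbers are arbitrary.

```python
import math

def sigmoid(t):
    """Logistic function, an assumed example of a sigmoidal sigma."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def dot(u, v):
    """Euclidean inner product <u, v>."""
    return sum(ui * vi for ui, vi in zip(u, v))

# Arbitrary illustrative values.
x, y = (0.2, 0.9), (1.5, -0.5)
b, phi, lam = 0.1, 0.3, 7.0

# Left side: sigma(lambda * (<x, y> + b) + phi), the definition of sigma_lambda.
lhs = sigmoid(lam * (dot(x, y) + b) + phi)

# Right side: sigma(<x, y'> + b') with y' = lambda * y and b' = lambda * b + phi,
# which has exactly the form covered by the assumption.
y_prime = tuple(lam * yi for yi in y)
b_prime = lam * b + phi
rhs = sigmoid(dot(x, y_prime) + b_prime)

print(lhs, rhs)
```

The two values agree up to floating-point rounding, which reflects that every $\sigma_\lambda$ is of the form $\sigma(\langle x,y'\rangle+b')$ for some $y'\in\mathbb{R}^n$ and $b'\in\mathbb{R}$.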