I am having difficulty understanding one of the steps in the proof of Lemma 1 of the Cybenko Universal Approximation Theorem.
Cybenko defines a sigmoidal function as $\sigma:\mathbb{R}\rightarrow\mathbb{R}$ such that
- $\displaystyle\lim_{t\rightarrow\infty}\sigma(t)=1$
- $\displaystyle \lim_{t\rightarrow-\infty}\sigma(t)=0$
He also uses $I_n=[0,1]^n$ and $M(I_n)=\{\mu:\mu\text{ is a regular, finite, signed Borel measure on } I_n\}.$ He also includes the definition of a discriminatory function.
Lemma 1. Any bounded, measurable sigmoidal function, $\sigma$, is discriminatory.
The proof proceeds as follows:
Let $x,y\in\mathbb{R}^n$ and $b,\varphi\in \mathbb{R}$. Define for each $\lambda\in\mathbb{R}$: $$\sigma_\lambda(x)=\sigma(\lambda(\langle x,y\rangle+b)+\varphi)$$
Then we have three cases:
Case 1. $\langle x,y\rangle+b=0$. Then,
$\displaystyle\lim_{\lambda\rightarrow\infty} \sigma_\lambda(x)=\sigma(\varphi)$
Case 2. $\langle x,y\rangle+b<0$.
$\displaystyle\lim_{\lambda\rightarrow\infty} \sigma_\lambda(x)=0$
Case 3. $\langle x,y\rangle+b>0$.
$\displaystyle\lim_{\lambda\rightarrow\infty} \sigma_\lambda(x)=1$
Define $\gamma:\mathbb{R}^n\rightarrow\mathbb{R}$ (for the fixed $y$, $b$ and $\varphi$) by
$\gamma(x)=\begin{cases} 0, & \langle x,y\rangle+b<0\\ 1, & \langle x,y\rangle+b>0\\ \sigma(\varphi), & \langle x,y\rangle+b=0 \end{cases}$
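Then we have $$\lim_{\lambda\rightarrow\infty} \sigma_\lambda(x)=\gamma(x)$$ So the family $\{\sigma_\lambda:\lambda\in\mathbb{R}\}$ converges pointwise to $\gamma$ as $\lambda\rightarrow\infty$, and because $\sigma$ is bounded, the $\sigma_\lambda$ are uniformly bounded: $|\sigma_\lambda(x)|\le\sup_t|\sigma(t)|$ for every $\lambda$ and $x$.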
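As a numerical sanity check of the three cases, here is a minimal sketch. It assumes the logistic function as one concrete example of a bounded, measurable sigmoidal $\sigma$ (Cybenko's lemma does not single it out), and the names `sigma`, `sigma_lambda`, `gamma` and the values of `Y`, `B`, `PHI` are chosen only for illustration.

```python
import math

def sigma(t):
    """Logistic function: an assumed example of a bounded sigmoidal function."""
    # Branch on the sign of t so that exp never overflows for large |t|.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def affine(x, y, b):
    """Compute <x, y> + b."""
    return sum(xi * yi for xi, yi in zip(x, y)) + b

def sigma_lambda(lam, x, y, b, phi):
    """sigma_lambda(x) = sigma(lambda * (<x, y> + b) + phi)."""
    return sigma(lam * affine(x, y, b) + phi)

def gamma(x, y, b, phi):
    """Pointwise limit of sigma_lambda as lambda -> infinity."""
    s = affine(x, y, b)
    if s < 0:
        return 0.0
    if s > 0:
        return 1.0
    return sigma(phi)

# Illustrative (hypothetical) parameters; the points lie in I_2 = [0, 1]^2.
Y, B, PHI = (1.0, -1.0), 0.0, 0.5
POINTS = [(0.2, 0.8), (0.8, 0.2), (0.5, 0.5)]  # <x,y>+b < 0, > 0, = 0

if __name__ == "__main__":
    for x in POINTS:
        for lam in (1.0, 10.0, 1000.0):
            gap = abs(sigma_lambda(lam, x, Y, B, PHI) - gamma(x, Y, B, PHI))
            print(x, lam, gap)
```

The gap shrinks as $\lambda$ grows at the first two points, and at the third point, which lies on the hyperplane, $\sigma_\lambda(x)=\sigma(\varphi)$ for every $\lambda$.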
Let $\mu\in M(I_n)$. Then we can apply the Lebesgue Bounded Convergence Theorem to obtain: $$\lim_{\lambda\rightarrow \infty}\int_{I_n} \sigma_\lambda(x)d\mu(x) = \int_{I_n} \gamma(x)d\mu(x)$$
Note that here in Cybenko's paper, he erroneously writes that: $$\int_{I_n} \sigma_\lambda(x)d\mu(x) = \int_{I_n} \gamma(x)d\mu(x)$$ I believe in this case he just left off the limit. Moving past this typo, however, he concludes that $$0=\int_{I_n} \sigma_\lambda(x)d\mu(x)$$
My question is: what is the justification for this step? That is, why is the integral equal to 0?
This is by assumption. The definition of $\sigma$ being discriminatory is that for every $\mu\in M(I_n)$, $$\left[(\forall y)( \forall b) \int_{I_n} \sigma(\langle x,y\rangle +b)d\mu(x)=0\right]\Rightarrow \mu=0$$ So we assume that $\int_{I_n} \sigma(\langle x,y\rangle +b)d\mu(x)=0$ for each $y\in \mathbb{R}^n$ and $b\in\mathbb{R}$.
Notice that $$\sigma_\lambda(x)=\sigma(\langle x, \lambda y\rangle+(\lambda b+\varphi)).$$ So this is a particular case of the assumption.
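To spell the substitution out (the names $y'$ and $b'$ are my own shorthand, not Cybenko's): for fixed $\lambda$, put $y'=\lambda y\in\mathbb{R}^n$ and $b'=\lambda b+\varphi\in\mathbb{R}$. Applying the assumption to the pair $(y',b')$ gives
$$\int_{I_n}\sigma_\lambda(x)\,d\mu(x)=\int_{I_n}\sigma(\langle x,y'\rangle+b')\,d\mu(x)=0\quad\text{for every }\lambda\in\mathbb{R}.$$
Combining this with the bounded convergence step, the limit of a sequence of zeros is zero, so
$$\int_{I_n}\gamma(x)\,d\mu(x)=\lim_{\lambda\rightarrow\infty}\int_{I_n}\sigma_\lambda(x)\,d\mu(x)=0.$$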