How to solve the inequality $\lvert w \sigma'(wa+b) \rvert \gt 1$?


While reading chapter 5 of the book Neural Networks and Deep Learning (see the related content of chapter five), I ran into an exercise that I have not been able to solve.

When training a deep neural network, the vanishing gradient problem is often encountered, so we need to adjust the weight parameter $w$ to counter it. But what range should $w$ lie in? Intuitively, the larger $w$ is, the better. However, $w$ also appears inside $\sigma'(z) = \sigma'(wa+b)$: when $w$ gets larger, the value of $\sigma'(wa+b)$ typically gets smaller (the activation function is the sigmoid $\sigma(x)=\frac{1}{1+e^{-x}}$, with $\sigma'(x) = \sigma(x)(1-\sigma(x))$). So we do not know in advance whether the product $|w \sigma'(wa+b)|$ becomes bigger or smaller. We therefore need to work out the range of $w$ over which a larger $w$ also gives a larger product; only then can the vanishing gradient problem be avoided.
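To make the competing effects concrete, here is a minimal Python sketch that evaluates the product for a few weights. The values `a = 0.5` and `b = 0.0` and the helper name `gradient_factor` are assumptions chosen only for illustration; they do not come from the book.

```python
import math

def sigmoid(z):
    """Sigmoid activation: 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_prime(z):
    """Derivative of the sigmoid: sigma(z) * (1 - sigma(z))."""
    s = sigmoid(z)
    return s * (1.0 - s)

def gradient_factor(w, a, b):
    """The product |w * sigma'(w*a + b)| discussed above (hypothetical helper name)."""
    return abs(w * sigmoid_prime(w * a + b))

# Illustrative values only (assumptions, not taken from the book).
a, b = 0.5, 0.0
factors = {w: gradient_factor(w, a, b) for w in (1, 2, 4, 8, 16, 32)}

for w, value in factors.items():
    print(f"w = {w:2d}  |w * sigma'(w*a + b)| = {value:.6f}")
```

For these particular values the product first grows with $w$ and then shrinks again, so a larger $w$ does not automatically give a larger product.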

The content of the exercise is:

Consider the product $|w \sigma'(wa+b)|$. Suppose $|w \sigma'(wa+b)| \geq 1$.

(1) Argue that this can only ever occur if $|w| \geq 4$.

(2) Supposing that $|w| \geq 4$, consider the set of input activations $a$ for which $|w \sigma'(wa+b)| \geq 1$. Show that the set of $a$ satisfying that constraint can range over an interval no greater in width than \begin{eqnarray} \frac{2}{|w|} \ln\left( \frac{|w|(1+\sqrt{1-4/|w|})}{2}-1\right). \tag{123}\end{eqnarray}

(3) Show numerically that the above expression bounding the width of the range is greatest at $|w| \approx 6.9$, where it takes a value $\approx 0.45$. And so even given that everything lines up just perfectly, we still have a fairly narrow range of input activations which can avoid the vanishing gradient problem.
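A sketch of how parts (1) and (2) might be approached, using only the sigmoid identity $\sigma'(z) = \sigma(z)(1-\sigma(z))$. The shorthands $s = \sigma(z)$, $z = wa+b$ and $r = \sqrt{1-4/|w|}$ are introduced here for convenience and are not the book's notation.

Since $0 < s < 1$, the derivative is bounded by its value at $s = \tfrac12$:

$$\sigma'(z) = s(1-s) \leq \frac14 \quad\Longrightarrow\quad |w \sigma'(wa+b)| \leq \frac{|w|}{4},$$

so the product can reach $1$ only if $|w| \geq 4$. For part (2), the constraint becomes a quadratic inequality in $s$:

$$s(1-s) \geq \frac{1}{|w|} \quad\Longleftrightarrow\quad s^2 - s + \frac{1}{|w|} \leq 0 \quad\Longleftrightarrow\quad \frac{1-r}{2} \leq s \leq \frac{1+r}{2}.$$

Inverting the sigmoid with $z = \ln\frac{s}{1-s}$ at the upper endpoint $s_+ = \frac{1+r}{2}$, and using $s_+(1-s_+) = \frac{1}{|w|}$ together with $r^2 = 1 - \frac{4}{|w|}$:

$$\frac{s_+}{1-s_+} = |w|\, s_+^2 = \frac{|w|(1+r)^2}{4} = \frac{|w|(1+r)}{2} - 1.$$

By symmetry the lower endpoint gives the negative of the same logarithm, so $z$ ranges over an interval of width $2\ln\left(\frac{|w|(1+r)}{2}-1\right)$. Because $z = wa + b$, dividing by $|w|$ gives the width in $a$, which matches expression (123).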

Although this comes from deep learning, I know it is purely a mathematical problem: all that is needed is to solve some equations and an inequality. Given my level of mathematics, I cannot solve it. Can somebody help me? Thanks very much.

I also have another question: can the inequality and the equation in this problem be solved with MATLAB?
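For what it is worth, part (3) only needs a one-dimensional numerical search, which any numerical tool can do. Below is a minimal sketch in Python rather than MATLAB (the choice of language, the search range $4 \leq |w| \leq 20$, the step of $0.001$, and the helper name `interval_width` are all assumptions made for illustration); the same grid search could be written in MATLAB.

```python
import math

def interval_width(w_abs):
    """Expression (123): bound on the width of the interval of a, for |w| >= 4."""
    root = math.sqrt(1.0 - 4.0 / w_abs)
    return (2.0 / w_abs) * math.log(w_abs * (1.0 + root) / 2.0 - 1.0)

# Plain grid search over |w| in [4, 20] with step 0.001 (range and step are assumptions).
grid = [4.0 + 0.001 * k for k in range(16001)]
best_w = max(grid, key=interval_width)
best_width = interval_width(best_w)

print(f"maximum near |w| = {best_w:.3f}, width = {best_width:.4f}")
```

A grid search is used here instead of a dedicated optimizer only to keep the sketch free of dependencies; the result should land close to the values $|w| \approx 6.9$ and $\approx 0.45$ stated in the exercise.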