Can it be shown mathematically that $\|W_d\,\mathrm{relu}(W_{d-1}\,\mathrm{relu}(\ldots \mathrm{relu}(W_1x)))\|\leq\|W_dW_{d-1}\ldots W_1x\|$?


Let $W_1,W_2, \ldots, W_d$ be linear transformations, and let $\mathrm{relu}(x)=\max(0,x)$ be an elementwise nonlinearity. Can we somehow show that \begin{equation} \|W_d\,\mathrm{relu}(W_{d-1}\,\mathrm{relu}(\ldots \mathrm{relu}(W_1x)))\|\leq\|W_dW_{d-1}\ldots W_1x\| \end{equation} where $\|\cdot\|$ is the $\ell_2$ norm?

Best answer

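I don't think it is true. Take $d=2$, let $W_1$ be the identity on $\mathbb{R}^2$ and $W_2(x_1,x_2) = (x_1+x_2, 0)$. Then for $x=(-1,1)$, $$W_2W_1(-1,1) = W_2(-1,1) = (0,0)$$ while $$W_2(\mathrm{relu}(W_1(-1,1))) = W_2(\mathrm{relu}(-1,1)) = W_2(0,1) = (1,0).$$ So the left-hand side has norm $1$ while the right-hand side has norm $0$, and the inequality fails.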
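
If it helps, the counterexample can be checked numerically. This is only a minimal sketch in plain Python: the helper names `relu`, `matvec` and `norm` are my own, and the matrices are just $W_1$ and $W_2$ from above written out as nested lists.

```python
import math

def relu(v):
    # Elementwise max(0, t)
    return [max(0.0, t) for t in v]

def matvec(W, v):
    # Matrix-vector product for a matrix stored as a list of rows
    return [sum(w * t for w, t in zip(row, v)) for row in W]

def norm(v):
    # Euclidean (l2) norm
    return math.sqrt(sum(t * t for t in v))

W1 = [[1.0, 0.0], [0.0, 1.0]]  # identity
W2 = [[1.0, 1.0], [0.0, 0.0]]  # (x1, x2) -> (x1 + x2, 0)
x = [-1.0, 1.0]

linear_out = matvec(W2, matvec(W1, x))      # W2 W1 x
relu_out = matvec(W2, relu(matvec(W1, x)))  # W2 relu(W1 x)

lhs = norm(relu_out)    # norm with the relu in between
rhs = norm(linear_out)  # norm of the purely linear product
print(lhs, rhs)
```

Here `lhs` is strictly larger than `rhs`, so the claimed inequality cannot hold for all choices of the $W_i$ and $x$.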