Computing dz[1] by applying the chain rule


In Andrew Ng's course, we need to compute dz[1] in a NN with 2 hidden layers.

So why don't we compute da[2] in order to apply the chain rule (back-propagation) properly?

Why is this correct?

$\frac{dL}{dz^{[1]}} = \frac{dL}{dz^{[2]}} \cdot \frac{dz^{[2]}}{da^{[1]}} \cdot \frac{da^{[1]}}{dz^{[1]}}$

and this is not?

$\frac{dL}{dz^{[1]}} = \frac{dL}{da^{[2]}} \cdot \frac{da^{[2]}}{dz^{[2]}} \cdot \frac{dz^{[2]}}{da^{[1]}} \cdot \frac{da^{[1]}}{dz^{[1]}}$

[Image: graph of the NN]
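To make the comparison concrete, here is a small numerical check of both expressions. It is only a sketch under my own assumptions, none of which come from the course material quoted above: one scalar unit per layer, a tanh hidden activation, a sigmoid output, binary cross-entropy loss, and made-up values for `x`, `y`, `w1`, `b1`, `w2`, `b2`. Under those assumptions, $\frac{dL}{da^{[2]}} \cdot \frac{da^{[2]}}{dz^{[2]}}$ simplifies to $a^{[2]} - y$, which is exactly $\frac{dL}{dz^{[2]}}$, so the four-factor chain and the three-factor chain should give the same number.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def loss_from_z1(z1, w2, b2, y):
    # Forward pass starting at z[1], used only for the finite-difference check.
    a1 = math.tanh(z1)
    a2 = sigmoid(w2 * a1 + b2)
    return -(y * math.log(a2) + (1 - y) * math.log(1 - a2))


# Hypothetical values, chosen only for illustration.
x, y = 0.5, 1.0
w1, b1, w2, b2 = 0.8, -0.1, 1.5, 0.2

# Forward pass.
z1 = w1 * x + b1
a1 = math.tanh(z1)
z2 = w2 * a1 + b2
a2 = sigmoid(z2)

# Local derivatives.
dL_da2 = -y / a2 + (1 - y) / (1 - a2)  # cross-entropy w.r.t. a[2]
da2_dz2 = a2 * (1 - a2)                # sigmoid derivative
dz2_da1 = w2
da1_dz1 = 1 - a1 ** 2                  # tanh derivative

# Four-factor chain: goes through da[2] explicitly.
long_form = dL_da2 * da2_dz2 * dz2_da1 * da1_dz1

# Three-factor chain: dL/dz[2] = a[2] - y already contains the da[2] step.
dL_dz2 = a2 - y
short_form = dL_dz2 * dz2_da1 * da1_dz1

# Central finite difference of the loss with respect to z[1].
eps = 1e-6
numeric = (loss_from_z1(z1 + eps, w2, b2, y)
           - loss_from_z1(z1 - eps, w2, b2, y)) / (2 * eps)

print(long_form, short_form, numeric)
```

If the three printed values agree up to rounding, then in this setup the two formulas are the same chain rule, and the first one just has the da[2] step folded into dL/dz[2].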