Solving Backpropagation


(Figure: example network to calculate)

How would I calculate the gradient here? I've been stuck for a few days now and any advice would be helpful. :)

With $\tanh$ as the activation function and MSE as the loss function: $$ \frac{\partial{J(\hat{y},y)}}{\partial{w^{[2]}_{1,2}}} = \frac{\partial{J(\hat{y},y)}}{\partial{a^{[3]}}} \frac{\partial{a^{[3]}}}{\partial z^{[3]}} \frac{\partial{z^{[3]}}}{\partial{a^{[2]}_1}} \frac{\partial{a^{[2]}_1}}{\partial{z^{[2]}_1}} \frac{\partial{z^{[2]}_1}}{\partial{w^{[2]}_{1,2}}} $$ Is that correct?
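As a sanity check, the chain of factors above can be verified numerically. The sketch below builds a small tanh network with hypothetical sizes (2 inputs, three layers of 3 units; all weights random), multiplies out each factor of the chain for $w^{[2]}_{1,2}$ (0-based index `[0, 1]`), and compares the result against a central finite difference of the loss. The layer sizes and the convention $z^{[l]} = W^{[l]} a^{[l-1]} + b^{[l]}$ are assumptions, not taken from the question.

```python
import numpy as np

rng = np.random.default_rng(1)

# hypothetical sizes: 2 inputs, layers [1], [2], [3] with 3 units each
n0, n1, n2, n3 = 2, 3, 3, 3
W1 = rng.normal(size=(n1, n0)); b1 = rng.normal(size=n1)
W2 = rng.normal(size=(n2, n1)); b2 = rng.normal(size=n2)
W3 = rng.normal(size=(n3, n2)); b3 = rng.normal(size=n3)
x  = rng.normal(size=n0)
y  = rng.normal(size=n3)

def forward(W2_):
    """Forward pass with tanh activations; returns activations and MSE loss (m = 1)."""
    a1 = np.tanh(W1 @ x + b1)
    a2 = np.tanh(W2_ @ a1 + b2)
    a3 = np.tanh(W3 @ a2 + b3)
    J = np.mean((a3 - y) ** 2)
    return a1, a2, a3, J

a1, a2, a3, J = forward(W2)

# one line per factor in the chain rule
dJ_da3 = 2 * (a3 - y) / n3        # dJ/da^[3]
dJ_dz3 = dJ_da3 * (1 - a3 ** 2)   # * da^[3]/dz^[3]  (tanh' = 1 - tanh^2)
dJ_da2 = W3.T @ dJ_dz3            # * dz^[3]/da^[2]
dJ_dz2 = dJ_da2 * (1 - a2 ** 2)   # * da^[2]/dz^[2]
grad = dJ_dz2[0] * a1[1]          # * dz^[2]_1/dw^[2]_{1,2} = a^[1]_2

# central finite-difference check on the same weight
eps = 1e-6
Wp = W2.copy(); Wp[0, 1] += eps
Wm = W2.copy(); Wm[0, 1] -= eps
num = (forward(Wp)[-1] - forward(Wm)[-1]) / (2 * eps)
print(grad, num)  # the two values should agree closely
```

Note that once the output layer has more than one unit, the first two factors are a row vector and a (diagonal) Jacobian rather than scalars, which is exactly the point the answer below makes.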

$m$ is the number of training examples

$n^{[L]}$ is the number of output variables / neurons in the last layer

$a^{[3]}$ is the output/prediction vector $\hat y$, and $(i)$ indexes the training examples: $$ J(\hat{y},y) = J(a^{[3]},y) = \frac{1}{m} \sum_{i=1}^{m} \frac{1}{n^{[L]}} \sum_{j=1}^{n^{[L]}} \bigl(a^{[3](i)}_j - y^{(i)}_j\bigr)^2 $$

$$ \frac {\partial J(a^{[3]},y)} {\partial {a^{[3]}}} = ?? $$


1 Answer


You got it almost right, but you missed one small point: $a$ and $z$ are not scalar variables, they are vectors. So when computing the partial derivative, you should compute it as a vector. We have:

$$ \frac{\partial J(a^{[3]}, y)}{\partial a^{[3]}} = \begin{bmatrix} \frac{\partial J(a^{[3]}, y)}{\partial a_1^{[3]}} & \frac{\partial J(a^{[3]}, y)}{\partial a_2^{[3]}} & \frac{\partial J(a^{[3]}, y)}{\partial a_3^{[3]}} \end{bmatrix} $$

and we can compute:

$$ \begin{align} \frac{\partial J(a^{[3]}, y)}{\partial a_k^{[3]}} &= \frac{1}{m} \sum_{i=1}^{m} \frac{1}{n^{[L]}}\, 2\bigl(a_k^{[3](i)} - y_k^{(i)}\bigr)\\ &= \frac{2}{m\, n^{[L]}} \sum_{i=1}^{m} \bigl(a_k^{[3](i)} - y_k^{(i)}\bigr) \end{align} $$

where $(i)$ denotes the $i$-th training example.
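In practice this gradient is usually kept as one row per training example, i.e. an `(m, n_L)` array whose entry $(i, k)$ is $\frac{2}{m\,n^{[L]}}(a_k^{[3](i)} - y_k^{(i)})$; summing its rows over $i$ gives the expression above. A minimal numpy sketch (shapes are assumptions), checked against central finite differences of the loss:

```python
import numpy as np

def mse_loss(a3, y):
    """MSE averaged over m examples and n_L output units; a3, y: (m, n_L)."""
    m, n_L = a3.shape
    return np.sum((a3 - y) ** 2) / (m * n_L)

def grad_a3(a3, y):
    """dJ/da^[3], elementwise: 2 (a3 - y) / (m * n_L)."""
    m, n_L = a3.shape
    return 2.0 * (a3 - y) / (m * n_L)

rng = np.random.default_rng(0)
a3 = rng.normal(size=(4, 3))
y  = rng.normal(size=(4, 3))
g = grad_a3(a3, y)

# numerical check: perturb each entry of a3 in turn
eps = 1e-6
num = np.zeros_like(a3)
for idx in np.ndindex(a3.shape):
    ap = a3.copy(); ap[idx] += eps
    am = a3.copy(); am[idx] -= eps
    num[idx] = (mse_loss(ap, y) - mse_loss(am, y)) / (2 * eps)

print(np.max(np.abs(g - num)))  # should be very small
```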