Gradient checking in neural network with dot product


I was taking the 2nd course of the deeplearning.ai specialization on Coursera. I was watching a video on gradient checking for neural networks. After we compute the gradient vector and the approximated gradient vector as shown there, why is the strange formula $$difference = \frac {\| grad - gradapprox \|_2}{\| grad \|_2 + \| gradapprox \|_2 } \tag{3}$$ used to measure how close the two vectors are? Why not use a cosine similarity?
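For concreteness, formula (3) can be sketched in numpy as follows (the function and variable names here are my own, not taken from the course code):

```python
import numpy as np

def gradient_check_difference(grad, gradapprox):
    """Relative difference from formula (3): a small value means the
    backprop gradient agrees with the finite-difference approximation."""
    numerator = np.linalg.norm(grad - gradapprox)
    denominator = np.linalg.norm(grad) + np.linalg.norm(gradapprox)
    return numerator / denominator

grad = np.array([1.0, 2.0, 3.0])
gradapprox = np.array([1.0, 2.0, 3.0001])
print(gradient_check_difference(grad, gradapprox))  # on the order of 1e-5
```

A common rule of thumb in the course is to treat a difference below roughly $10^{-7}$ as a pass.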


Best answer:

Probably to avoid division-by-zero errors. As we approach a point where the gradient is zero, the last step may have a vector whose length rounds to $0$. That's not a problem here as long as only one of the two does. You can of course rewrite the formula in terms of the usual cosine similarity (I'll leave that as an exercise). It's also natural to subtract one vector from the other elsewhere in gradient descent, so you can recycle a cached value.
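The division-by-zero point can be demonstrated with a small numpy sketch (function names are illustrative): cosine similarity divides by the product of the norms, so it is undefined if either vector is zero, while formula (3) divides by the sum, which only fails if both are zero.

```python
import numpy as np

def cosine_similarity(a, b):
    # Divides by the PRODUCT of norms: undefined if either vector is zero.
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def relative_difference(a, b):
    # Formula (3): divides by the SUM of norms, so it is well defined
    # as long as at least one vector is nonzero.
    return np.linalg.norm(a - b) / (np.linalg.norm(a) + np.linalg.norm(b))

zero = np.zeros(3)
v = np.array([1e-8, 0.0, 0.0])
print(relative_difference(zero, v))  # still well defined
# cosine_similarity(zero, v) would evaluate 0/0 and produce nan
```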

Second answer:

The idea is that you want to know when the update is small, so that you can stop the iterations. The problem is: what does it mean to be small?

One option is to calculate the distance $\|{\bf a} - {\bf b} \|_2$ and compare it against $0$, or a very small number $\epsilon$: if it is close to zero, stop. But here is the problem: imagine you multiply the cost function by a factor $k > 0$ (arbitrary, e.g. the size of the problem, or $1/2$, ...). Then each vector is scaled by the same factor:

$$ \| k {\bf a} - k {\bf b} \|_2 = k \|{\bf a} - {\bf b} \|_2 $$

For example, imagine $k = 10^3$: what value should you now compare against to stop? If you don't change $\epsilon$, the algorithm may stop even though it has not converged.

To avoid this problem, divide by the lengths of the vectors:

$$ \frac{\| k {\bf a} - k {\bf b} \|_2}{k\|{\bf a}\|_2 + k\|{\bf b}\|_2} = \frac{k\|{\bf a} - {\bf b} \|_2}{k\left(\|{\bf a}\|_2 + \|{\bf b}\|_2\right)} = \frac{\|{\bf a} - {\bf b} \|_2}{\|{\bf a}\|_2 + \|{\bf b}\|_2} $$

which clearly does not depend on the scale of the problem, so $\epsilon$ is now meaningful.
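This scale invariance is easy to verify numerically (a minimal sketch with made-up vectors):

```python
import numpy as np

def relative_difference(a, b):
    # The normalized ratio from the answer above.
    return np.linalg.norm(a - b) / (np.linalg.norm(a) + np.linalg.norm(b))

a = np.array([0.5, -1.2, 3.0])
b = np.array([0.49, -1.21, 3.02])
k = 1e3

# Scaling both vectors by k leaves the normalized ratio unchanged...
print(np.isclose(relative_difference(a, b),
                 relative_difference(k * a, k * b)))  # True

# ...whereas the raw distance grows by the factor k (approximately 1000 here).
print(np.linalg.norm(k * a - k * b) / np.linalg.norm(a - b))
```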