I read that, when optimising a function $f(w)$ with gradient descent, after many iterations the direction of the update approaches the direction of the eigenvector corresponding to the smallest eigenvalue of the Hessian $H$. How can we prove this is the case?
Edit: for this to hold there is an assumption that the smallest eigenvalue is strictly smaller than the second smallest, with a sufficiently large gap between them.
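Here is a quick numerical check I put together (my own sketch, not from where I read the claim). On a quadratic $f(w) = \frac{1}{2} w^\top H w$, the gradient step gives $w_{k+1} = (I - \eta H) w_k$, so the component of $w$ along each eigenvector $v_i$ decays like $(1 - \eta \lambda_i)^k$; the component for the smallest eigenvalue decays slowest and should dominate eventually:

```python
import numpy as np

# Sketch: gradient descent on f(w) = 1/2 w^T H w, where grad f = H w.
# H is chosen diagonal so its eigenvectors are the coordinate axes and
# the smallest eigenvalue (0.1) is clearly separated from the others.
rng = np.random.default_rng(0)
H = np.diag([0.1, 1.0, 2.0])   # eigenvalues 0.1 < 1.0 < 2.0
eta = 0.4                      # step size with eta * lambda_max < 1
w = rng.standard_normal(3)

for _ in range(200):
    w = w - eta * (H @ w)      # gradient step: each eigencomponent
                               # shrinks by a factor (1 - eta * lambda_i)

# Eigenvector of the smallest eigenvalue is the first coordinate axis.
v_min = np.array([1.0, 0.0, 0.0])
alignment = abs(w @ v_min) / np.linalg.norm(w)
print(alignment)               # should be very close to 1
```

After 200 steps the components along the larger eigenvalues have shrunk by factors of $0.6^{200}$ and $0.2^{200}$, versus only $0.96^{200}$ for the smallest, so $w$ (and hence the gradient $Hw$) is essentially parallel to that eigenvector. I would still like a proper proof of this.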