I am trying to really understand why the gradient of a function gives the direction of steepest ascent intuitively.
Assuming that the function is differentiable at the point in question,
a) I had a look at a few resources online and also looked at "Why is gradient the direction of steepest ascent?", a popular question on this StackExchange site.
The accepted answer basically says that we take the dot product of the gradient with an arbitrary vector and then observe that the product is maximal when the vector points in the same direction as the gradient. To me this really does not answer the question, but it has 31 upvotes, so can someone please point out what I am obviously missing?
b) Does the gradient of a function tell us a way to reach the maxima or minima? If yes, then how, and which one: maxima, minima, or both?
Edit: I read the gradient descent algorithm and that answers this part of my question.
c) Since the gradient is a feature of the function at some particular point, am I right in assuming that it can only point to a local maximum or minimum?
I first learned it this way: if $f(x,y,z) = k$ is a surface, then $\nabla f$ is a vector perpendicular to the surface.
That is, the plane tangent to the surface at $\mathbf x = (x_1,y_1,z_1)$ is $\frac {\partial f}{\partial x}(\mathbf x) (x-x_1) + \frac {\partial f}{\partial y}(\mathbf x) (y-y_1) + \frac {\partial f}{\partial z}(\mathbf x)(z - z_1) = 0,$
and $(\frac{\partial f}{\partial x}(\mathbf x), \frac{\partial f}{\partial y}(\mathbf x),\frac {\partial f}{\partial z}(\mathbf x))$ is normal to that plane.
So $\nabla f$ is a vector perpendicular to the surface when $k$ is fixed. Now we allow $k$ some freedom, and we want to move in the direction of greatest change in $k.$ Whatever direction we go has a component perpendicular to the surface and a component parallel to the surface. If we move parallel to the surface, we are not contributing (to first order) to a change in $k.$ So the direction of maximal change is $100\%$ perpendicular to the surface.
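If it helps to see that split numerically, here is a minimal sketch. The function $f(x,y,z) = x^2 + y^2 + z^2$, the point $(1,2,2)$, the direction $(1,0,0)$ and all helper names are my own choices for illustration, not anything from the question.

```python
import math

def f(p):
    # Hypothetical example: level surfaces f = k are spheres around the origin.
    x, y, z = p
    return x*x + y*y + z*z

def grad_f(p):
    # Analytic gradient of f.
    x, y, z = p
    return (2*x, 2*y, 2*z)

def dot(a, b):
    return sum(ai*bi for ai, bi in zip(a, b))

def rate_of_change(p, d, h=1e-5):
    # Change in f divided by h after stepping from p to p + h*d.
    q = tuple(pi + h*di for pi, di in zip(p, d))
    return (f(q) - f(p)) / h

p = (1.0, 2.0, 2.0)                   # a point on the surface f = 9
g = grad_f(p)
norm_g = math.sqrt(dot(g, g))
n = tuple(gi / norm_g for gi in g)    # unit normal to the surface at p

d = (1.0, 0.0, 0.0)                   # an arbitrary direction of travel
c = dot(d, n)
d_perp = tuple(c*ni for ni in n)                      # part perpendicular to the surface
d_par = tuple(di - qi for di, qi in zip(d, d_perp))   # part parallel to the surface

rate_d = rate_of_change(p, d)
rate_perp = rate_of_change(p, d_perp)
rate_par = rate_of_change(p, d_par)
print(rate_d, rate_perp, rate_par)
```

Under these assumptions, the rate along the parallel part should come out close to zero, and the rate along the full direction should nearly match the rate along the perpendicular part alone.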
If that intuition isn't working for you, then we are back to the answer you found less than satisfying.
$\frac {\partial f}{\partial x}$ is the rate of change of $f$ with respect to a change in $x.$
For any unit vector $u,$ $\nabla f \cdot u$ is the rate of change of $f$ as we move in the direction $u.$
And we want to find the $u$ that maximizes $\nabla f \cdot u = \|\nabla f\| \cos\theta,$ where $\theta$ is the angle between $u$ and $\nabla f.$
This is maximal when $\theta = 0$, that is, when $u$ points in the same direction as $\nabla f.$
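A brute-force check of this claim might look like the sketch below. The function $f(x,y) = x^2 + 3y$, the point $(1,2)$ and the helper names are assumptions I picked for illustration. It estimates the rate of change of $f$ in many unit directions by finite differences and compares the steepest one with the direction of $\nabla f.$

```python
import math

def f(x, y):
    # Hypothetical example function, chosen only for illustration.
    return x**2 + 3*y

def grad_f(x, y):
    # Analytic gradient of f: (df/dx, df/dy).
    return (2*x, 3.0)

def directional_derivative(x, y, theta, h=1e-6):
    # Central finite-difference estimate of the rate of change of f at (x, y)
    # in the unit direction u = (cos theta, sin theta).
    ux, uy = math.cos(theta), math.sin(theta)
    return (f(x + h*ux, y + h*uy) - f(x - h*ux, y - h*uy)) / (2*h)

x0, y0 = 1.0, 2.0
gx, gy = grad_f(x0, y0)

# Scan 3600 directions around the circle and keep the steepest one.
angles = [2*math.pi*k/3600 for k in range(3600)]
best_theta = max(angles, key=lambda t: directional_derivative(x0, y0, t))
grad_theta = math.atan2(gy, gx) % (2*math.pi)

print(best_theta, grad_theta)
print(directional_derivative(x0, y0, best_theta), math.hypot(gx, gy))
```

If the argument above is right, the two angles should agree to within the grid spacing, and the largest rate of change found should be close to $\|\nabla f\|.$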
Does $\nabla f$ tell us the direction of steepest descent, too? It certainly does: straight in the opposite direction, $-\nabla f.$
$\nabla f$ does not necessarily point directly toward a local maximum or minimum. It points in the direction of greatest change at the point where you are standing. Imagine yourself climbing a hill: straight up the slope is not necessarily the direction of the peak. You may get up the steep part and then have to make a turn.
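To make the hill picture concrete, here is a small sketch with a made-up elongated hill, $f(x,y) = -(x^2 + 10y^2),$ whose only peak is at the origin. The function, the starting point $(1,1)$ and the step size are my own assumptions. It measures the angle between the gradient and the straight line to the peak, then follows the gradient in small steps.

```python
import math

def grad_f(x, y):
    # Gradient of the hypothetical hill f(x, y) = -(x**2 + 10*y**2),
    # whose only peak is at the origin.
    return (-2*x, -20*y)

x, y = 1.0, 1.0
gx, gy = grad_f(x, y)
to_peak = (-x, -y)   # direction from (x, y) straight to the peak at (0, 0)

# Angle between the gradient and the straight-line direction to the peak.
cos_angle = (gx*to_peak[0] + gy*to_peak[1]) / (math.hypot(gx, gy) * math.hypot(*to_peak))
angle_deg = math.degrees(math.acos(cos_angle))

# Follow the gradient in small steps (gradient ascent with a fixed step size).
step = 0.01
for _ in range(2000):
    gx, gy = grad_f(x, y)
    x, y = x + step*gx, y + step*gy

print(angle_deg, (x, y))
```

On this hill the gradient at the starting point is well off the straight line to the peak, yet repeatedly following it still ends up at the peak, because the direction is recomputed at every step. That is the "turn" in the hill-climbing picture.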