Intuition about gradient


https://en.wikipedia.org/wiki/Gradient

The gradient is a vector that we can obtain from any differentiable function by taking its partial derivatives.

From Wiki: "...the gradient points in the direction of the greatest rate of increase of the function"

I can't understand why this vector points in the direction of greatest increase of a function! By definition, a partial derivative is the limit of the ratio of the increment of the function to the increment of the argument, as the increment of the argument tends to zero. So should the increment of the argument be positive? But for some functions, increasing the argument gives a smaller value of the function.

For example, let's consider the function

y = -x^2

The gradient of this function is

-2x or (1, -2x)?

How can I find the direction of increase of this function at the point (-1;-1)?

Feeling lost. Sorry for the mess.


There are 2 best solutions below

3
On BEST ANSWER

Let's look at your one-dimensional case. The gradient of the function $f\left(x\right) = -x^2$ is the one-dimensional vector $-2x$, which is itself a function of $x$. If $x$ is positive, the gradient is negative, because $-x^2$ increases as $x$ decreases toward 0. If $x$ is negative, the gradient is positive, because the function increases as $x$ increases toward 0 (its maximum is at 0). At your point $x=-1$, for instance, the gradient is $2$, so the direction of increase is the positive $x$-direction. The key point is that the gradient is itself a function of $x$: it is negative on the regions where the function is decreasing, since there the function increases in the opposite, negative direction.
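As a quick sanity check of the one-dimensional case, here is a minimal Python sketch. The helper names (`grad_f`, `step_along_gradient`) and the step size `h` are illustrative choices, not anything from the question. It takes a small step in the direction given by the sign of the gradient and checks that $f$ went up.

```python
def f(x):
    return -x**2

def grad_f(x):
    # Derivative of -x^2
    return -2 * x

def step_along_gradient(x, h=1e-3):
    """Move a small distance h in the direction given by the sign of the gradient."""
    g = grad_f(x)
    direction = 1.0 if g > 0 else -1.0 if g < 0 else 0.0
    return x + h * direction

for x0 in (-1.0, 1.0):
    x1 = step_along_gradient(x0)
    # Print the point, its gradient, and whether f increased after the step
    print(x0, grad_f(x0), f(x1) > f(x0))
```

At both sample points the step along the gradient increases $f$: to the right from $x=-1$, and to the left from $x=1$.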

The same goes for multiple dimensions. Consider, for example:

$f\left(x,y\right)=x^2 - y^3$

then the gradient of f would be the vector

$\left(\begin{array}{c} \frac{\partial f}{\partial x}\\\frac{\partial f}{\partial y}\end{array}\right)=\left(\begin{array}{c} 2x\\-3y^2\end{array}\right)$

This vector still points, within the $xy$-plane, in the direction of steepest ascent.
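To double-check the partial derivatives numerically, one can compare the analytic gradient with central differences. This is only a sketch: the sample point $(1, 2)$ and the step `h` are arbitrary assumptions, and `numeric_grad` is a hypothetical helper written for this illustration.

```python
def f(x, y):
    return x**2 - y**3

def grad_f(x, y):
    # Analytic gradient: (df/dx, df/dy) = (2x, -3y^2)
    return (2 * x, -3 * y**2)

def numeric_grad(func, x, y, h=1e-6):
    # Central-difference approximation of each partial derivative
    dfdx = (func(x + h, y) - func(x - h, y)) / (2 * h)
    dfdy = (func(x, y + h) - func(x, y - h)) / (2 * h)
    return (dfdx, dfdy)

x0, y0 = 1.0, 2.0  # arbitrary sample point
print(grad_f(x0, y0))
print(numeric_grad(f, x0, y0))
```

The two printed vectors should agree up to a small numerical error.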

0
On

Let's try to prove that the gradient indeed points in the direction of the steepest rate of increase at a point $x_0$.

Assume $f(x)\in\mathbb{R}$, where $x \in \mathbb{R}^n$, is a function differentiable at $x_0$. Suppose we want to compute the rate of increase at $x_0$ in a particular direction $v \in \mathbb{R}^n$, where $\left|v\right| = 1$ (the vector points in the direction along which we compute the rate of increase, and it has length one because we only care about its direction, not its length).

Now, the values the function takes along the direction $v$ around $x_0$ are described by the function $f(vt + x_0)$, for $t$ in some interval $(-\epsilon,\epsilon)$, $\epsilon > 0$. To find the rate of increase $r_v$ at $x_0$ along the positive direction of $v$, we just differentiate $f(vt+x_0)$ at $t=0$: $r_v=\frac{d}{dt} f(vt+x_0)|_{t=0}$.

Computing $r_v$ gives us $r_v = \nabla f^T(x_0)v$ (using the chain rule and the convention that $\nabla f$ is a column vector).
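The identity $r_v = \nabla f^T(x_0)v$ can be checked numerically. The sketch below is only an illustration under assumptions: it borrows $f(x,y) = x^2 - y^3$ from the other answer, picks an arbitrary point $x_0 = (1, 2)$ and an arbitrary unit direction, and uses a hypothetical helper `directional_rate` that differentiates $t \mapsto f(x_0 + tv)$ at $t = 0$ by central differences.

```python
import math

def f(p):
    x, y = p
    return x**2 - y**3

def grad_f(p):
    x, y = p
    return (2 * x, -3 * y**2)

def directional_rate(func, x0, v, h=1e-6):
    # Central-difference estimate of d/dt f(x0 + t v) at t = 0
    fwd = [xi + h * vi for xi, vi in zip(x0, v)]
    bwd = [xi - h * vi for xi, vi in zip(x0, v)]
    return (func(fwd) - func(bwd)) / (2 * h)

x0 = (1.0, 2.0)                      # arbitrary sample point
v = (math.cos(0.7), math.sin(0.7))   # arbitrary unit direction
dot = sum(g * vi for g, vi in zip(grad_f(x0), v))  # grad f(x0)^T v

print(directional_rate(f, x0, v), dot)
```

The finite-difference rate and the dot product should match up to numerical error.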

OK, now we have computed the rate of increase $r_v$ at $x_0$ for every direction $v$ in $\mathbb{R}^n$. The original question was to find the direction $v$ for which $r_v$ is maximized.

Now notice that the maximum value of $r_v$ is attained exactly when $v$ points in the same direction as $\nabla f(x_0)$, that is, $v_{opt} = \frac{\nabla f(x_0)}{\left|\nabla f(x_0)\right|}$ (assuming $\nabla f(x_0) \neq 0$). Hence $\nabla f(x_0)$ indeed always points in the direction of steepest increase.

Mathematically speaking, the optimal $v$ follows from the Cauchy-Schwarz inequality: $r_v = \nabla f^T(x_0)v \leq |\nabla f(x_0)||v| = |\nabla f(x_0)|$, since $|v|=1$, and this upper bound is attained by $v_{opt} = \frac{\nabla f(x_0)}{\left|\nabla f(x_0)\right|}$.
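The bound can also be seen numerically in two dimensions by sampling many unit directions and keeping the one with the largest $r_v$. This is a rough sketch, not a proof. The function $f(x,y) = x^2 - y^3$, the point $(1, 2)$ and the sample count `n` are all assumptions made for the illustration.

```python
import math

def grad_f(x, y):
    # Gradient of the illustrative function f(x, y) = x^2 - y^3
    return (2 * x, -3 * y**2)

gx, gy = grad_f(1.0, 2.0)   # arbitrary sample point x_0 = (1, 2)
norm = math.hypot(gx, gy)   # |grad f(x_0)|

n = 3600                    # number of sampled unit directions
best_rate, best_theta = -math.inf, 0.0
for k in range(n):
    theta = 2 * math.pi * k / n
    # r_v = grad f(x_0)^T v with v = (cos theta, sin theta), so |v| = 1
    rate = gx * math.cos(theta) + gy * math.sin(theta)
    if rate > best_rate:
        best_rate, best_theta = rate, theta

grad_theta = math.atan2(gy, gx) % (2 * math.pi)  # angle of the gradient itself

print(best_rate, norm)          # largest sampled rate vs. the upper bound
print(best_theta, grad_theta)   # best sampled angle vs. the gradient's angle
```

The largest sampled rate stays below $|\nabla f(x_0)|$ and gets close to it, and the best sampled angle lands within one grid step of the gradient's own angle.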