Hessian-vector products


Can someone explain why this is true?

$$g(x + \Delta x) \approx g(x) + H(x) \Delta x$$

where $g$ is the gradient of the function $f(x)$ with respect to $x$, and $H$ is the Hessian of $f(x)$ with respect to $x$.

I would really appreciate a detailed derivation, because I don't understand what it means to take the gradient/derivative of $\Delta x$, which represents the change in $x$.

Source towards the top here: https://justindomke.wordpress.com/2009/01/17/hessian-vector-products/

Best answer

For a scalar function $f$ of a scalar variable $x$, the first-order Taylor expansion gives

$f(x + dx) \approx f(x) + \frac{d f(x)}{dx} \, dx$
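As a quick sanity check of the scalar expansion, here is a minimal sketch in plain Python. The choice of $f(x) = \sin(x)$, the point $x = 1$, and the helper name `taylor_error` are assumptions made for illustration only; they do not come from the linked post.

```python
import math

def taylor_error(x, dx):
    """Absolute gap between sin(x + dx) and its first-order expansion at x."""
    exact = math.sin(x + dx)
    # f(x) + f'(x) * dx with f = sin, f' = cos (illustrative choice of f)
    linear = math.sin(x) + math.cos(x) * dx
    return abs(exact - linear)

err_a = taylor_error(1.0, 1e-2)  # larger step
err_b = taylor_error(1.0, 1e-3)  # step 10x smaller

print(err_a, err_b, err_a / err_b)
```

Because the neglected term is second order in $dx$, shrinking the step by a factor of 10 should shrink the error by roughly a factor of 100.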

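Vector case [$x$ is an $(n,1)$ column vector]:

$g(x + \Delta x) = g(x) + \Delta g(x)$, where $\Delta g(x) := g(x + \Delta x) - g(x)$ is the change in $g$ caused by moving from $x$ to $x + \Delta x$. The goal is to approximate $\Delta g(x)$ to first order.

$g(x) = \nabla f(x) = \begin{bmatrix} g_1{(x)} \\ g_2{(x)} \\ \vdots \\ g_n{(x)} \end{bmatrix}$, where $g_i{(x)} = \frac{\partial f(x)}{\partial x_i}$

$H(x) = \nabla g(x) = \begin{bmatrix} \nabla g_1{(x)}^T \\ \nabla g_2{(x)}^T \\ \vdots \\ \nabla g_n{(x)}^T \end{bmatrix}$

Each component $g_i$ is a scalar function of the vector $x$, so its first-order expansion is $g_i(x + \Delta x) \approx g_i(x) + \nabla g_i{(x)}^T \Delta x$. Stacking the $n$ components:

$g(x + \Delta x) = g(x) + \Delta g(x) \approx g(x) + \begin{bmatrix} \nabla g_1{(x)}^T \Delta x \\ \nabla g_2{(x)}^T \Delta x\\ \vdots \\ \nabla g_n{(x)}^T \Delta x \end{bmatrix} = g(x) + H(x) \Delta x$

Translation: $\Delta g(x)$ is the small change in $g$ produced by a small change $\Delta x$ in the input. Nothing is differentiated with respect to $\Delta x$; it is an ordinary displacement vector that multiplies the derivatives of $g$ evaluated at $x$.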

If you look at the last column vector, each row $\nabla g_i(x)^T \Delta x$ is the rate of change of $g_i(x)$ with respect to $x$, dotted with the small change in $x$, which gives the first-order change in $g_i(x)$. Taken together, the rows give the first-order change in $g(x)$, that is, $\Delta g(x) \approx H(x) \Delta x$.
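The approximation can also be checked numerically. The sketch below uses only the Python standard library and a hypothetical test function $f(x_1, x_2) = x_1^2 x_2 + \sin(x_2)$, with its gradient and Hessian written out by hand. The function, the evaluation point, the direction, and all helper names are assumptions chosen for illustration. It also shows the rearranged form $H(x) v \approx (g(x + r v) - g(x)) / r$, a finite-difference Hessian-vector product that never forms $H$ explicitly.

```python
import math

def grad(x):
    """Gradient g(x) of the illustrative f(x1, x2) = x1^2 * x2 + sin(x2)."""
    x1, x2 = x
    return [2.0 * x1 * x2, x1 * x1 + math.cos(x2)]

def hessian(x):
    """Hessian H(x) of the same f; row i is the gradient of g_i."""
    x1, x2 = x
    return [[2.0 * x2, 2.0 * x1],
            [2.0 * x1, -math.sin(x2)]]

def matvec(H, v):
    """Row i of the result is (gradient of g_i)^T v."""
    return [sum(h * vj for h, vj in zip(row, v)) for row in H]

def linearization_error(x, dx):
    """Largest component of |g(x + dx) - (g(x) + H(x) dx)|."""
    g_shifted = grad([xi + di for xi, di in zip(x, dx)])
    g_linear = [gi + hi for gi, hi in zip(grad(x), matvec(hessian(x), dx))]
    return max(abs(a - b) for a, b in zip(g_shifted, g_linear))

def hvp_finite_difference(x, v, r=1e-6):
    """Approximate H(x) v as (g(x + r v) - g(x)) / r, without building H."""
    g_shifted = grad([xi + r * vi for xi, vi in zip(x, v)])
    return [(a - b) / r for a, b in zip(g_shifted, grad(x))]

x = [1.0, 2.0]
direction = [0.3, -0.4]

err_big = linearization_error(x, [1e-2 * d for d in direction])
err_small = linearization_error(x, [1e-3 * d for d in direction])

hv_exact = matvec(hessian(x), direction)
hv_fd = hvp_finite_difference(x, direction)

print(err_big, err_small)
print(hv_exact, hv_fd)
```

The linearization error should fall roughly 100-fold when $\Delta x$ is made 10 times smaller, which is the signature of a first-order approximation. The step size `r` in the finite-difference product is a trade-off: too large and the neglected second-order term dominates, too small and floating-point cancellation does.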

See also: http://www.friesian.com/calculus.htm