Checking if the Hessian is the derivative of the gradient


Suppose $f: \Bbb R^n \to \Bbb R$. I have code that computes the gradient of $f$, and other code that computes the Hessian of $f$ times a vector. Now I want to check whether they are correct. Specifically, I am checking whether the Hessian times a vector is the directional derivative of the gradient.

I have

$$\text{Hessian}(x)\,v \approx \frac{1}{h}\left(\nabla f(x+hv) - \nabla f(x)\right),$$ where $h$ is a scalar.

Define:

$$\text{fderror} := \left(\text{Hessian}(x)\,v - \frac{1}{h}\left(\nabla f(x+hv) - \nabla f(x)\right)\right)^T \left(\text{Hessian}(x)\,v - \frac{1}{h}\left(\nabla f(x+hv) - \nabla f(x)\right)\right).$$
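For concreteness, here is a minimal Python sketch of this check on a quadratic test problem $f(x) = \tfrac12 x^T A x$, where the gradient is $Ax$ and the exact Hessian-times-vector is $Av$. The names `grad` and `hessvec`, the quadratic test function, and the choice $h = 10^{-6}$ are all illustrative stand-ins, not the code from the question:

```python
import numpy as np

# Quadratic test problem: f(x) = 0.5 * x^T A x, so grad(x) = A x and Hessian(x) v = A v.
rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n))
A = A + A.T  # symmetrize so A is a valid Hessian

def grad(x):
    return A @ x

def hessvec(x, v):
    return A @ v

x = rng.standard_normal(n)
v = rng.standard_normal(n)
h = 1e-6

fd = (grad(x + h * v) - grad(x)) / h   # directional derivative of the gradient
r = hessvec(x, v) - fd                 # residual between the two computations
fderror = r @ r                        # squared norm, as in the definition above
print(fderror)
```

For a quadratic $f$ the forward difference is exact in exact arithmetic, so `fderror` here is on the order of rounding error only; for a general $f$ it also carries an $O(h^2)$ truncation term from the squared norm.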

For some reason, as $h$ decreases, my $\text{fderror}$ increases, and vice versa. When $h$ is around 1e-10, my $\text{fderror}$ is around 1e+00.

Is this due to floating point, or is my Hessian-times-vector code incorrect?

I would greatly appreciate your help.


Well, $1e-10$ is both small enough and large enough that it could be either rounding error or a problem with your code; it's hard to say without further information.

  1. A good rule of thumb is to check to somewhere a bit less than the square root of machine epsilon. I like to go with $1e-6$ or $1e-7$.

  2. Run your code with a sequence of progressively smaller finite difference sizes $h$ and make a plot of error vs. $h$ on a logarithmic scale. If everything is correct, you should see a straight line going down until machine precision, where it starts to get a little weird. [Plot: finite difference error vs. $h$ for a correct implementation.]

  3. It's very common to use approximations to the Hessian (e.g., the Gauss-Newton approximation), in which case one would expect error that does not go away as you make $h$ smaller.
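The sweep in point 2 can be sketched in Python on a simple scalar example (the cosine test function mirrors the Matlab snippet below; the fixed values of `x0` and `dx` are arbitrary illustrative choices):

```python
import numpy as np

x0, dx = 0.7, 1.3                    # evaluation point and direction
hs = 10.0 ** -np.arange(1, 15)       # step sizes h = 1e-1 down to 1e-14
fd = (np.cos(x0 + hs * dx) - np.cos(x0)) / hs   # forward differences of f = cos
exact = -np.sin(x0) * dx             # exact directional derivative
err = np.abs(exact - fd)
# Truncation error ~ O(h) dominates for large h; rounding error ~ eps/h for tiny h,
# so the error is smallest near h ~ sqrt(eps) ~ 1e-8 and grows again below that.
for h, e in zip(hs, err):
    print(f"h = {h:.0e}  error = {e:.2e}")
```

On a log-log plot this traces the expected straight line down to roughly the square root of machine epsilon, after which rounding takes over, which is exactly the behavior described in the question.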

For reference, the Matlab code to generate the above plot is:

x0 = randn;                            % random evaluation point
dx = randn;                            % random direction
hh = 10.^(-(1:0.5:15));                % sequence of step sizes
fd = (cos(x0 + hh*dx) - cos(x0))./hh;  % forward difference approximations
exact = -sin(x0)*dx;                   % exact directional derivative of cos
err = abs(exact - fd);                 % finite difference error
loglog(hh, err)                        % straight line until rounding takes over