Hessian matrix formula in PRML

On page 316 of PRML (Bishop's machine learning book), the Hessian matrix is given as $A = -\nabla \nabla \Psi(a_N)$, where $\Psi(a_N)$ is the objective function and $a_N$ is the variable vector. Shouldn't it be $A = \nabla \nabla \Psi(a_N)$, just as in the definition on the wiki?

What am I missing here?

There is 1 best solution below

When asking a question like this it is usually a good idea to supply some more context. That said, you are right that, as you have defined it, $\nabla \nabla \Psi (\mathbf{a})$ is the Hessian of $\Psi$. However, $\Psi$ is being maximised, so at the maximum this Hessian is negative semi-definite. Taking the Hessian of $-\Psi$ instead gives a matrix that is evaluated at a minimum and is therefore positive semi-definite. This is the matrix required for the local quadratic approximation from which the covariance of the Laplace approximation is constructed.
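To make the sign convention concrete, here is a sketch of the usual second-order expansion around the maximiser $\mathbf{a}_N$. It assumes that $\Psi$ is the log of the (unnormalised) density being approximated; the gradient term drops out because $\mathbf{a}_N$ is a stationary point.

$$\Psi(\mathbf{a}) \simeq \Psi(\mathbf{a}_N) - \frac{1}{2} (\mathbf{a} - \mathbf{a}_N)^{\mathrm{T}} \mathbf{A} \, (\mathbf{a} - \mathbf{a}_N), \qquad \mathbf{A} = -\nabla \nabla \Psi(\mathbf{a}) \big|_{\mathbf{a} = \mathbf{a}_N}$$

Exponentiating the right-hand side gives a Gaussian of the form $\mathcal{N}(\mathbf{a} \mid \mathbf{a}_N, \mathbf{A}^{-1})$, which is only a valid density when $\mathbf{A}$ is positive definite. With the opposite sign, $\nabla \nabla \Psi$ would be negative definite at the maximum and could not serve as a precision matrix.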

So in this context $\mathbf{A}$ is to be interpreted as the Hessian of $-\Psi$, where the negative sign is taken to give a positive semi-definite matrix. This is in fact what Bishop states when he writes

"the Hessian matrix $\mathbf{A} = - \nabla \nabla \Psi $"

and not the way you have interpreted it, which is

"the Hessian matrix of $\Psi$ is given by $\mathbf{A} = - \nabla \nabla \Psi$"

Hopefully that clears up the distinction.
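As a quick numerical sanity check of the sign convention, here is a minimal Python sketch. The objective `psi` is a toy concave function chosen purely for illustration (it is not the function from PRML), and the helper names `hessian`, `is_positive_definite_2x2` and `inverse_2x2` are hypothetical. It estimates the Hessian at the maximiser by finite differences and shows that the Hessian itself is not positive definite, while its negative is.

```python
def psi(a):
    """Toy concave objective (an assumption for illustration), maximised at (1, -2)."""
    x, y = a[0] - 1.0, a[1] + 2.0
    return -(x * x + 0.5 * x * y + y * y)


def hessian(f, a, h=1e-4):
    """Central finite-difference estimate of the Hessian of f at the point a."""
    n = len(a)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(di, dj):
                # Evaluate f with coordinate i moved by di*h and coordinate j by dj*h.
                p = list(a)
                p[i] += di * h
                p[j] += dj * h
                return f(p)
            H[i][j] = (shifted(1, 1) - shifted(1, -1)
                       - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)
    return H


def is_positive_definite_2x2(M):
    """Sylvester's criterion for a symmetric 2x2 matrix."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return M[0][0] > 0 and det > 0


def inverse_2x2(M):
    """Closed-form inverse of a 2x2 matrix."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det, M[0][0] / det]]


mode = [1.0, -2.0]                      # maximiser of the toy psi
H = hessian(psi, mode)                  # Hessian of psi: negative definite here
A = [[-v for v in row] for row in H]    # A = -Hessian: positive definite
cov = inverse_2x2(A)                    # covariance of the Gaussian approximation

print("Hessian of psi positive definite?", is_positive_definite_2x2(H))
print("A = -Hessian positive definite?  ", is_positive_definite_2x2(A))
print("covariance A^{-1}:", cov)
```

Only `A` can be inverted into a valid covariance matrix here, which is why the minus sign is folded into its definition.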