Outer product approximation of Hessian for least squares

On p. 251 of Bishop's machine learning book, the Hessian of the least-squares error is derived as a preliminary step to the outer product approximation:

$ E = \frac{1}{2} \sum_{n=1}^{N} (y_n - t_n)^2$

$H = \nabla \nabla E = \sum_{n=1}^{N} \nabla y_n (\nabla y_n)^T + \sum_{n=1}^{N} (y_n - t_n) \nabla \nabla y_n $
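As a sanity check on the expression for $H$, here is a minimal numerical sketch. It assumes a made-up one-layer model $y_n = \tanh(w^T x_n)$ with random data; the model, the data, and every name in the code (`error`, `hessian_terms`, `numerical_hessian`) are illustrative assumptions, not anything from the book. It builds the two sums separately and compares their total with a finite-difference Hessian of $E$.

```python
import numpy as np

rng = np.random.default_rng(0)
N, D = 20, 3
X = rng.normal(size=(N, D))   # inputs x_n (made-up data)
t = rng.normal(size=N)        # targets t_n (made-up data)
w = rng.normal(size=D)        # point in weight space where H is evaluated


def error(w):
    """E = 1/2 * sum_n (y_n - t_n)^2 for the assumed model y_n = tanh(w . x_n)."""
    y = np.tanh(X @ w)
    return 0.5 * np.sum((y - t) ** 2)


def hessian_terms(w):
    """Return the two sums in the expression for H as separate D x D matrices."""
    y = np.tanh(X @ w)
    s = 1.0 - y ** 2                      # dy_n/da_n, where a_n = w . x_n
    grad_y = s[:, None] * X               # row n holds (grad y_n)^T
    outer = grad_y.T @ grad_y             # sum_n grad y_n (grad y_n)^T
    c = (y - t) * (-2.0 * y * s)          # (y_n - t_n) * d^2 y_n / d a_n^2
    curvature = (X * c[:, None]).T @ X    # sum_n (y_n - t_n) grad grad y_n
    return outer, curvature


def numerical_hessian(f, w, eps=1e-4):
    """Central finite-difference approximation to the Hessian of f at w."""
    d = w.size
    I = np.eye(d)
    H = np.zeros((d, d))
    for i in range(d):
        for j in range(d):
            H[i, j] = (
                f(w + eps * I[i] + eps * I[j])
                - f(w + eps * I[i] - eps * I[j])
                - f(w - eps * I[i] + eps * I[j])
                + f(w - eps * I[i] - eps * I[j])
            ) / (4.0 * eps ** 2)
    return H


outer, curvature = hessian_terms(w)
H_formula = outer + curvature
H_numeric = numerical_hessian(error, w)
max_abs_diff = np.max(np.abs(H_formula - H_numeric))
print("max |H_formula - H_numeric| =", max_abs_diff)
```

The `outer` matrix alone is what the outer product approximation keeps. It is positive semidefinite by construction, whereas `curvature` need not be.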

Firstly, why is the Hessian written as $\nabla \nabla E$ rather than $\nabla \nabla^T E$?

Secondly, could someone please explain how the full expression for the Hessian is obtained?
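
For reference, here is a sketch of how the expression can be obtained componentwise. It assumes each $y_n$ is a twice-differentiable function of a weight vector $w$ with components $w_i$ (the symbols $w$, $i$, $j$ are introduced here for illustration). Differentiating $E$ once with the chain rule gives

$ \frac{\partial E}{\partial w_i} = \sum_{n=1}^{N} (y_n - t_n) \frac{\partial y_n}{\partial w_i} $

Differentiating again with respect to $w_j$ and applying the product rule to each summand gives

$ \frac{\partial^2 E}{\partial w_i \partial w_j} = \sum_{n=1}^{N} \frac{\partial y_n}{\partial w_i} \frac{\partial y_n}{\partial w_j} + \sum_{n=1}^{N} (y_n - t_n) \frac{\partial^2 y_n}{\partial w_i \partial w_j} $

The first sum is the $(i,j)$ entry of $\sum_n \nabla y_n (\nabla y_n)^T$ and the second is the $(i,j)$ entry of $\sum_n (y_n - t_n) \nabla \nabla y_n$. On this reading, $\nabla \nabla E$ is shorthand for the matrix with entries $\partial^2 E / \partial w_i \partial w_j$, which is the same object that other texts write as $\nabla \nabla^T E$, so the difference appears to be one of notation only.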