How does $\left[\nabla f\left(\mathbf{x}\right)\right]^\mathrm{T} \nabla f\left(\mathbf{x}\right)$ approximate $\mathbf{H}$?

Page 3 of a guide to Levenberg-Marquardt optimization says that $\left[\nabla f\left(\mathbf{x}\right)\right]^\mathrm{T} \nabla f\left(\mathbf{x}\right)$ approximates the Hessian matrix of $f$. I do not understand this at all. How does this work?

On BEST ANSWER

Let $\phi(x) = f(x)^2$, where $f : \mathbb{R}^n \to \mathbb{R}$.

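Then, treating ${\partial f(x) \over \partial x}$ as a $1 \times n$ row vector, the product and chain rules give ${\partial^2 \phi(x) \over \partial x^2} = 2 \left( f(x){\partial^2 f(x) \over \partial x^2} + {\partial f(x) \over \partial x}^T {\partial f(x) \over \partial x} \right)$.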
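
To see where the two terms come from, it may help to write the derivatives componentwise. The index notation $x_i$, $x_j$ is introduced here only for illustration. Differentiating $\phi = f^2$ once and then once more gives

$$\frac{\partial \phi}{\partial x_i} = 2 f(x) \frac{\partial f}{\partial x_i}, \qquad \frac{\partial^2 \phi}{\partial x_i \, \partial x_j} = 2 \left( f(x) \frac{\partial^2 f}{\partial x_i \, \partial x_j} + \frac{\partial f}{\partial x_i} \frac{\partial f}{\partial x_j} \right).$$

The second term inside the parentheses is the $(i,j)$ entry of the outer product of the gradient of $f$ with itself, and the first is $f(x)$ times the $(i,j)$ entry of the Hessian of $f$.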

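Gauss-Newton methods approximate ${\partial^2 \phi(x) \over \partial x^2}$ by the term $2{\partial f(x) \over \partial x}^T {\partial f(x) \over \partial x}$, dropping the term $2 f(x){\partial^2 f(x) \over \partial x^2}$. This is a good approximation when $f(x)$ is small, as it is near a solution with a small residual, or when $f$ is nearly linear. Levenberg-Marquardt methods instead approximate ${\partial^2 \phi(x) \over \partial x^2}$ by the term $2 \left( \lambda D + {\partial f(x) \over \partial x}^T {\partial f(x) \over \partial x} \right)$, where $\lambda \ge 0$ is a damping parameter and $D$ is typically some positive diagonal matrix.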
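
As a rough numerical check, the sketch below compares the exact Hessian of $\phi = f^2$ with the outer-product approximation for one made-up example. The function `f`, the test points, and all helper names are assumptions chosen for illustration; none of them come from the guide. The finite-difference Hessian is only there as an independent check on the formula.

```python
import numpy as np

# Hypothetical example function f: R^2 -> R (chosen only for illustration).
def f(x):
    return x[0] ** 2 + x[0] * x[1] - 2.0

def grad_f(x):
    # Gradient as a 1 x n row vector, matching the convention in the answer.
    return np.array([[2.0 * x[0] + x[1], x[0]]])

def hess_f(x):
    # Hessian of f (constant, because f is quadratic).
    return np.array([[2.0, 1.0], [1.0, 0.0]])

def phi(x):
    return f(x) ** 2

def gauss_newton_hessian(x):
    # 2 * (df/dx)^T (df/dx): the outer-product approximation.
    J = grad_f(x)
    return 2.0 * J.T @ J

def exact_hessian(x):
    # 2 * (f * d2f/dx2 + (df/dx)^T (df/dx)): the full Hessian of phi.
    return 2.0 * f(x) * hess_f(x) + gauss_newton_hessian(x)

def numerical_hessian(func, x, h=1e-3):
    # Central finite-difference Hessian, used as an independent check.
    n = x.size
    I = np.eye(n)
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            H[i, j] = (
                func(x + h * I[i] + h * I[j])
                - func(x + h * I[i] - h * I[j])
                - func(x - h * I[i] + h * I[j])
                + func(x - h * I[i] - h * I[j])
            ) / (4.0 * h * h)
    return H

x_root = np.array([1.0, 1.0])  # f(x_root) = 0, so the dropped term vanishes
x_far = np.array([2.0, 1.0])   # f(x_far) = 4, so the dropped term matters

for x in (x_root, x_far):
    print("x =", x, " f(x) =", f(x))
    print("exact Hessian of phi:\n", exact_hessian(x))
    print("Gauss-Newton approximation:\n", gauss_newton_hessian(x))
```

At `x_root` the two matrices coincide because $f$ vanishes there. At `x_far` they differ by exactly $2 f(x)$ times the Hessian of $f$, which is the term that Gauss-Newton drops.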