Problem understanding Quasi-Newton method: BFGS

203 Views Asked by At

I have problem understanding Quasi-Newton BFGS method.

Quasi-Newton method is based on approximation of $f$ where: $f(x_k + \Delta x) \approx f(x_k) + \nabla f(x_k)^{\mathrm T} \,\Delta x + \frac{1}{2} \Delta x^{\mathrm T} B \,\Delta x$,

where $(\nabla f)$ is the gradient, and $B$ an approximation to the Hessian matrix.

In BFGS the Hessian matrix is $B_{k+1} = B_k + \frac{\mathbf{y}_k \mathbf{y}_k^{\mathrm{T}}}{\mathbf{y}_k^{\mathrm{T}} \mathbf{s}_k} - \frac{B_k \mathbf{s}_k \mathbf{s}_k^{\mathrm{T}} B_k^{\mathrm{T}} }{\mathbf{s}_k^{\mathrm{T}} B_k \mathbf{s}_k}.$

My question is: Do we substitute BFGS Hessian matrix $B_{k+1}$ to $f(x_k + \Delta x) $ as $B$ or why do we have the first formula.

Thanks

1

There are 1 best solutions below

3
On BEST ANSWER

Yes, you put that as $B$ and then solve $s_k = \arg\min_{\Delta x} \left\{f(x_k) + \nabla f(x_k)^{\mathrm T} \,\Delta x + \frac{1}{2} \Delta x^{\mathrm T} B \,\Delta x\right\}$. Since $B$ is positive definite, the solution is obtained by setting the derivative to $0$: $s_k = -B_k^{-1} \nabla f(x_k)$.