Newton method with approximate Hessian: choosing a descent direction when only the eigenvalues of the Hessian are known


The question I am struggling with is: 'Suppose that it is easy to compute the eigenvalues of the Hessian of $f$. Briefly describe an alternative way to compute a suitable matrix $\boldsymbol{B}_{k}$.'

--

The overall optimization problem is as follows:

$$ \min _{\boldsymbol{x} \in \mathbb{R}^{n}} f(\boldsymbol{x}) $$ where $f: \mathbb{R}^{n} \rightarrow \mathbb{R}$ is twice continuously differentiable. The method is a Newton variant.

The descent direction is $\Delta \boldsymbol{x}_{k}=-\boldsymbol{B}_{k}^{-1}\nabla f\left(\boldsymbol{x}_{k}\right)$.

Update: Determine a scalar step size $t_{k}>0$ and update the solution,

$ \boldsymbol{x}_{k+1}=\boldsymbol{x}_{k}+t_{k} \Delta \boldsymbol{x}_{k} $

$ \boldsymbol{B}_{k}=\begin{cases} \nabla^{2} f\left(\boldsymbol{x}_{k}\right) & \text{if } \nabla^{2} f\left(\boldsymbol{x}_{k}\right) \text{ is positive definite} \\ \boldsymbol{I} & \text{otherwise} \end{cases} $
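For concreteness, the variant above might be sketched like this (a rough sketch: the name `newton_variant`, the fixed step size, tolerance, and iteration cap are my own choices, not part of the problem statement):

```python
import numpy as np

def newton_variant(grad, hess, x0, t=1.0, tol=1e-8, max_iter=100):
    """Newton variant from the question: use the Hessian as B_k when it
    is positive definite, otherwise fall back to B_k = I (gradient step)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        H = hess(x)
        # Positive-definiteness check via the eigenvalues of symmetric H.
        if np.all(np.linalg.eigvalsh(H) > 0):
            dx = -np.linalg.solve(H, g)  # Newton direction: -H^{-1} g
        else:
            dx = -g                      # B_k = I: steepest descent
        x = x + t * dx                   # fixed step size for simplicity
    return x
```

On a strictly convex quadratic, a single Newton step with $t_k = 1$ lands on the minimizer.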

--

Now $f(\boldsymbol{x})$ is not necessarily convex, but the $\boldsymbol{B}_{k}$ chosen above is always positive definite, which guarantees $\Delta \boldsymbol{x}_{k}$ is a descent direction (i.e. $\nabla f\left(\boldsymbol{x}_{k}\right)^{T} \Delta \boldsymbol{x}_{k}<0$). I'm unsure what the question wants instead.

My guess:

$\boldsymbol{B}_{k}=\operatorname{diag}(\lambda_{i})$ if all $\lambda_{i}$ are positive, and $\boldsymbol{B}_{k}=\boldsymbol{I}$ otherwise. This mirrors the original rule, since the eigenvalues alone still tell you exactly when the Hessian is positive definite. Writing the eigendecomposition as $\nabla^{2} f\left(\boldsymbol{x}_{k}\right)=U \Sigma U^{T}$, this gives $\boldsymbol{B}_{k}^{-1}=\Sigma^{-1}$ instead of $U \Sigma^{-1} U^{T}$.

But I can't really justify this: $U$ is orthogonal, and knowing only the eigenvalues tells you nothing about which eigenvector each one corresponds to. Since $\operatorname{diag}(\lambda_{i})$ is positive definite whenever all $\lambda_{i}>0$, the step is still a descent direction, but it amounts to rescaling the steepest-descent direction coordinate-wise by an essentially arbitrary assignment of eigenvalues to coordinates.
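To convince myself, here is a quick numerical sanity check of my guess (my own sketch, not from the problem; the variable names are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
# A random symmetric positive definite "Hessian" and a gradient vector.
M = rng.standard_normal((4, 4))
H = M @ M.T + 4 * np.eye(4)
g = rng.standard_normal(4)

lam = np.linalg.eigvalsh(H)           # eigenvalues only, no eigenvectors
dx_guess = -g / lam                   # B_k = diag(lambda_i) => dx = -diag(1/lambda_i) g
dx_newton = -np.linalg.solve(H, g)    # true Newton direction, for comparison

# diag(lambda_i) is positive definite, so the guess is still a descent direction...
assert g @ dx_guess < 0
# ...but it is generally not the Newton direction.
assert not np.allclose(dx_guess, dx_newton)
```

So the guess always yields *a* descent direction; what I don't see is why it should be a *good* one.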

Can anyone help me clear up what's going on?