Finding zero of a matrix function via trust region subproblem


I have a question about some steps carried out in a paper (https://arxiv.org/abs/1508.07497, Section 9.2.2).

In order to perform a gradient descent step, we need to solve the following equation for the square matrix $B$. Notice that the difficulty is the division by its Frobenius norm.

$B\left(ZZ^T + \frac{\lambda}{\|B\|_F}I\right) = RZ^{T}$
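To make the circularity concrete, here is a toy NumPy sketch (my own construction, with made-up sizes and data, not from the paper) that treats $\|B\|_F$ as a fixed-point unknown: guess the norm, solve the resulting linear system, recompute the norm, and repeat.

```python
import numpy as np

# Hypothetical small instance; Z, R, lam, and the sizes are placeholders.
rng = np.random.default_rng(0)
n, k = 4, 3
Z = rng.standard_normal((n, k))
R = rng.standard_normal((n, k))
lam = 0.5

# The catch: B (ZZ^T + (lam/||B||_F) I) = R Z^T depends on ||B||_F, which
# we don't know until we have B.  A naive fixed-point iteration: fix the
# norm, solve the now-linear system, update the norm.
norm_B = 1.0  # initial guess for ||B||_F
for _ in range(100):
    A = Z @ Z.T + (lam / norm_B) * np.eye(n)
    # Solve B A = R Z^T; A is symmetric, so B = solve(A, (R Z^T)^T)^T.
    B = np.linalg.solve(A, (R @ Z.T).T).T
    norm_B = np.linalg.norm(B, 'fro')

# Residual of the original nonlinear equation (small if we converged).
resid = B @ (Z @ Z.T + (lam / norm_B) * np.eye(n)) - R @ Z.T
print(np.linalg.norm(resid))
```

Whether this naive iteration converges in general is not obvious; it is only meant to show why the $\lambda/\|B\|_F$ term makes the equation nonlinear in $B$.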

The paper notes that the term in parentheses is positive semi-definite, which apparently makes it possible to set up a trust-region subproblem (why?).

The paper proceeds by deriving a vectorized version of this equation:

$b^T\left(X + \frac{\lambda}{\|B\|_F}I\right) = p^T$

where $b = \operatorname{vec}(B)$, $X = ZZ^T \otimes I$, and $p = \operatorname{vec}(RZ^T)$.
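I did convince myself of the vectorization step numerically: with column-stacking $\operatorname{vec}$, the identity $\operatorname{vec}(BM) = (M^T \otimes I)\operatorname{vec}(B)$ applied to $M = ZZ^T + \sigma I$ gives exactly $X + \sigma I$. A quick check (toy sizes, with $\sigma$ standing in for $\lambda/\|B\|_F$ as a fixed scalar):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, k = 2, 4, 3          # illustrative sizes, not from the paper
Z = rng.standard_normal((n, k))
B = rng.standard_normal((m, n))
sigma = 0.7                # stands in for lam/||B||_F, held fixed here

# vec() is column-stacking, matching vec(B M) = (M^T kron I) vec(B).
def vec(A):
    return A.reshape(-1, order='F')

M = Z @ Z.T + sigma * np.eye(n)         # symmetric, so M^T = M
X = np.kron(Z @ Z.T, np.eye(m))         # the paper's X = ZZ^T kron I

lhs_matrix = vec(B @ M)                              # vec of B(ZZ^T + sigma I)
lhs_vector = (X + sigma * np.eye(m * n)) @ vec(B)    # (X + sigma I) b

print(np.allclose(lhs_matrix, lhs_vector))  # True
```

Since $X$ is symmetric, $(X + \sigma I)b = p$ and $b^T(X + \sigma I) = p^T$ are the same statement.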

It then claims that this results in the trust-region subproblem

$\underset{b}{\text{min}} \ \frac{1}{2}b^T X X^T b + p^T b$

$\text{subject to } \|b\|_2 \le \Delta$

where $\Delta \ge 0$ is the trust-region radius.
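For reference, my understanding of a trust-region subproblem in the standard convention $\min_b \frac{1}{2}b^THb + g^Tb$ subject to $\|b\|_2 \le \Delta$ (my notation, not the paper's): the KKT conditions say the solution satisfies $(H + \sigma I)b = -g$ for some multiplier $\sigma \ge 0$, which looks suggestively like the shifted linear system above. A minimal bisection-based sketch:

```python
import numpy as np

# Minimal trust-region-subproblem sketch for positive semidefinite H:
#   min_b 0.5 b^T H b + g^T b   s.t.  ||b||_2 <= Delta.
# KKT: (H + sigma I) b = -g for some sigma >= 0, with sigma = 0 when the
# constraint is inactive.
def solve_trs(H, g, Delta):
    def b_of(sigma):
        return np.linalg.solve(H + sigma * np.eye(len(g)), -g)
    b = np.linalg.lstsq(H, -g, rcond=None)[0]  # try sigma = 0 first
    if np.linalg.norm(b) <= Delta:
        return b, 0.0
    # ||b(sigma)|| decreases as sigma grows, so bisect on sigma.
    lo, hi = 0.0, 1.0
    while np.linalg.norm(b_of(hi)) > Delta:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if np.linalg.norm(b_of(mid)) > Delta:
            lo = mid
        else:
            hi = mid
    sigma = 0.5 * (lo + hi)
    return b_of(sigma), sigma

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 5))
H = A @ A.T                     # symmetric positive (semi)definite
g = rng.standard_normal(5)
Delta = 0.3

b, sigma = solve_trs(H, g, Delta)
print(np.linalg.norm(b), sigma)
print(np.allclose((H + sigma * np.eye(5)) @ b, -g, atol=1e-6))
```

Note the multiplier $\sigma$ plays the same structural role as the scalar $\lambda/\|B\|_F$ in the first equation, though I am not asserting this is exactly the paper's mapping, which is precisely what I'm asking about.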

I understand (although I have little experience) what a trust-region optimization method is, but I don't see how this optimization problem lets us find a $B$ that solves the first equation. I first thought that it was minimizing

$\left\|B\left(ZZ^T + \frac{\lambda}{\|B\|_F}I\right) - RZ^T\right\|$

in order to find a zero, but after trying to check this I'm almost certain that is not the case. How does solving this trust-region problem help?
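For what it's worth, the check I attempted amounts to evaluating this residual for candidate matrices $B$ (made-up data, sizes chosen only for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 4, 3
Z = rng.standard_normal((n, k))
R = rng.standard_normal((n, k))
lam = 0.5

def residual(B):
    """Frobenius norm of B(ZZ^T + (lam/||B||_F) I) - R Z^T."""
    sigma = lam / np.linalg.norm(B, 'fro')
    return np.linalg.norm(B @ (Z @ Z.T + sigma * np.eye(n)) - R @ Z.T, 'fro')

# The minimizer of the quoted trust-region objective did not appear to
# drive this residual to zero in my experiments.
val = residual(rng.standard_normal((n, n)))
print(val)
```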