Optimization problem involving a matrix inverse.


$\newcommand{\diag}{\operatorname{diag}}$

I have the following optimization problem:

\begin{align} \mathop{\arg\min}_\beta\ & \frac{1}{2} a' [ M + \diag( \beta ) \otimes I_d ]^{-1} a + 1'\beta, \quad \text{s.t. } \beta \geq 0 \end{align}

where \begin{align} \beta &\in\mathbb{R}^{m} \text{ is a column vector,}\\ A &\in\mathbb{R}^{m \times k}, \quad B\in\mathbb{R}^{d\times k}, \quad I_d \text{ is the } d\times d \text{ identity matrix,}\\ M &= AA' \otimes I_d + I_m \otimes BB' \in\mathbb{R}^{md\times md} \text{ is a sparse matrix,} \end{align}

and $d = 512$, $m = 40$. Both $A$ and $B$ are dense matrices. This can of course be optimized with gradient descent, but every gradient evaluation requires computing $[ M + \diag( \beta ) \otimes I_d ]^{-1} a$. I have used left division, i.e., $[ M + \diag( \beta ) \otimes I_d ] \backslash a$ in Matlab, but it takes around 20 seconds per solve. I have also tried the decomposition-based methods in SuiteSparse, which are much slower. Since the optimization runs for hundreds of thousands of iterations, this is far too slow.
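For concreteness, here is the setup and the bottleneck solve sketched in Python/SciPy (the question itself uses MATLAB; the dimensions are shrunk for illustration and $k$ is assumed, since it isn't specified above). The last line is my derivation of the gradient, $g_i = 1 - \tfrac12\|x_i\|^2$ where $x_i$ is the $i$-th length-$d$ block of $x = [M + \diag(\beta)\otimes I_d]^{-1}a$, which is why the solve is needed at every iteration:

```python
import numpy as np
from scipy.sparse import identity, kron, diags, csc_matrix
from scipy.sparse.linalg import spsolve

# Small illustrative sizes; the actual problem has m = 40, d = 512
# (k is not given in the question, so k = 5 here is an assumption).
m, d, k = 6, 8, 5
rng = np.random.default_rng(0)
A = rng.standard_normal((m, k))
B = rng.standard_normal((d, k))
beta = rng.random(m)          # beta >= 0
a = rng.standard_normal(m * d)

# M = AA' (x) I_d  +  I_m (x) BB'
M = kron(csc_matrix(A @ A.T), identity(d)) + kron(identity(m), csc_matrix(B @ B.T))
K = (M + kron(diags(beta), identity(d))).tocsc()   # md x md system matrix

# The bottleneck: one left division per gradient evaluation.
x = spsolve(K, a)

# Gradient of f(beta) = 0.5 a' K^{-1} a + 1' beta:
# dK/dbeta_i = E_ii (x) I_d, hence df/dbeta_i = 1 - 0.5 * ||x_i||^2,
# where x_i = x[i*d:(i+1)*d] is the i-th length-d block of x.
g = 1.0 - 0.5 * (x.reshape(m, d) ** 2).sum(axis=1)
```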

In this case, are there any methods that avoid left division with such a big matrix? Since $M$ and $a$ are fixed across iterations, redoing the left division from scratch every time seems unnecessary. It looks like we could factorize $M$ once and reuse the factorization, but I haven't figured out an efficient way to do so.
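To make the factorization idea concrete: since $M + \diag(\beta)\otimes I_d = (AA' + \diag(\beta)) \otimes I_d + I_m \otimes BB'$, applying the inverse to $a$ is equivalent to solving the Sylvester equation $(BB')X + X(AA' + \diag(\beta)) = \operatorname{mat}(a)$, where $\operatorname{mat}(a)$ reshapes $a$ into a $d \times m$ matrix column-major. This decouples after eigendecomposing the fixed $BB'$ once, plus the small $m \times m$ factor each iteration. Here is a sketch of what I have in mind (Python/NumPy rather than MATLAB, shrunk dimensions, $k$ assumed; I haven't verified it at full scale):

```python
import numpy as np

# Small illustrative sizes; the actual problem has m = 40, d = 512
# (k = 5 is an assumption, since the question does not specify it).
m, d, k = 6, 8, 5
rng = np.random.default_rng(1)
A = rng.standard_normal((m, k))
B = rng.standard_normal((d, k))
beta = rng.random(m)                 # beta >= 0
a = rng.standard_normal(m * d)

S = B @ B.T                          # d x d, fixed: eigendecompose ONCE
lam, U = np.linalg.eigh(S)           # S = U diag(lam) U'
C = A @ A.T + np.diag(beta)          # m x m, changes with beta: cheap per iteration
mu, V = np.linalg.eigh(C)            # C = V diag(mu) V'

# K = C (x) I_d + I_m (x) S acts on vec(X) (column-major, X is d x m)
# as S X + X C, so K^{-1} a solves the Sylvester equation S X + X C = mat(a).
Ra = a.reshape((d, m), order="F")
Y = (U.T @ Ra @ V) / (lam[:, None] + mu[None, :])
X = U @ Y @ V.T
x = X.reshape(-1, order="F")         # equals [M + diag(beta) (x) I_d]^{-1} a
```

If this is correct, it would replace one $md \times md$ left division per iteration with a one-time $O(d^3)$ eigendecomposition plus $O(d^2 m + d m^2 + m^3)$ work per iteration, but I'd appreciate confirmation or a better approach.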

Alternatively, is there any way to optimize without gradient-descent-like methods, e.g., a closed-form solution?

Thanks for your suggestions!