Structured rank-$2$ approximation of a symmetric matrix


I am looking for a method to solve the following optimization problem

$$ \min_{\mathbf{x},\mathbf{y} \in \mathbb{R}^{n}} \; \|A - \mathbf{x}\mathbf{x}^{\top} - \mathbf{y}\mathbf{y}^{\top} \|_{\text{F}}^2 $$

where $A$ is an $n \times n$ real symmetric matrix and $\| \cdot \|_{\text{F}}$ denotes the Frobenius norm.

I have tried to solve this by differentiating the objective, which yields the following first-order conditions:

$$ A\mathbf{x}-\| \mathbf{x}\|^2\mathbf{x}-(\mathbf{y}^{\top}\mathbf{x})\mathbf{y} = 0$$ $$ A\mathbf{y}-\| \mathbf{y}\|^2\mathbf{y}-(\mathbf{y}^{\top}\mathbf{x})\mathbf{x} = 0.$$

However, this nonlinear system does not seem easy to solve!

Any ideas to tackle this problem? Thanks for any help!


There are 2 answers below.

BEST ANSWER

From the first-order conditions it follows that $x$ is either $0$ or an eigenvector of $A-yy^T$ with a nonnegative eigenvalue: rewrite the first equation as $(A-yy^T)x = \|x\|^2 x$, so the eigenvalue is $\|x\|^2 \geq 0$ (and vice versa for $y$).

My first guess is to use the spectral decomposition $A = \sum_i \lambda_i v_i v_i^T$ with orthonormal eigenvectors and $\lambda_1 \geq \lambda_2 \geq \dots \geq \lambda_n$, and to check whether $x = \sqrt{\lambda_1}v_1$ and $y = \sqrt{\lambda_2}v_2$ is optimal (take $x=0$ if $\lambda_1 < 0$ and $y=0$ if $\lambda_2 <0$). This solution satisfies the first-order conditions (those listed in the question seem correct to me), so we may be on the right track.
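As a quick numerical sanity check (a NumPy sketch; the symmetric test matrix is an arbitrary random choice of mine, not from the question), this candidate does satisfy the first-order conditions stated above:

```python
import numpy as np

# Arbitrary random symmetric test matrix
rng = np.random.default_rng(1)
B = rng.standard_normal((6, 6))
A = B + B.T

lam, V = np.linalg.eigh(A)                  # eigenvalues in ascending order
x = np.sqrt(max(lam[-1], 0.0)) * V[:, -1]   # sqrt(lambda_1) * v_1, clamped at 0
y = np.sqrt(max(lam[-2], 0.0)) * V[:, -2]   # sqrt(lambda_2) * v_2, clamped at 0

# Residuals of the two first-order conditions from the question;
# both should vanish (up to floating-point error)
r1 = A @ x - (x @ x) * x - (y @ x) * y
r2 = A @ y - (y @ y) * y - (y @ x) * x
```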

We need to prove that $$\|A-\lambda_1v_1v_1^T-\lambda_2v_2v_2^T \|_F^2 \leq \|A-xx^T - yy^T\|_F^2$$ for any other solution $x,y$. This simplifies to: $$ -\lambda_1^2 - \lambda_2^2 \leq -2x^TAx - 2y^TAy + (x^Tx)^2 + (y^Ty)^2 + 2(x^Ty)^2.$$ I do not see why this is true. Maybe someone can take it from here.
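For completeness, the simplification comes from expanding the square using $\langle A, xx^T\rangle = x^TAx$, $\|xx^T\|_F^2 = (x^Tx)^2$ and $\langle xx^T, yy^T\rangle = (x^Ty)^2$:

$$\|A-xx^T-yy^T\|_F^2 = \|A\|_F^2 - 2x^TAx - 2y^TAy + (x^Tx)^2 + (y^Ty)^2 + 2(x^Ty)^2,$$

while for the candidate $x = \sqrt{\lambda_1}v_1$, $y = \sqrt{\lambda_2}v_2$ (with $\lambda_1, \lambda_2 \geq 0$) the same expression evaluates to $\|A\|_F^2 - \lambda_1^2 - \lambda_2^2$, since $x^TAx = \lambda_1^2$, $(x^Tx)^2 = \lambda_1^2$ and $x^Ty = 0$.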

SECOND ANSWER

The best approximation of a given rank can usually be obtained from the singular value decomposition (the Eckart–Young theorem): keep the singular vectors associated with the largest singular values.

However, your case is not that general, since you are using $xx^T$ rather than $uv^T$ (which is what the SVD gives). It works out here if your matrix $A$ is symmetric and positive semidefinite: its singular values are then its eigenvalues, with its eigenvectors serving as both left and right singular vectors.

Thus, compute the eigendecomposition of $A$ and choose the vectors corresponding to the two largest eigenvalues. These (scaled by the square roots of the corresponding eigenvalues) are your $x$ and $y$. This does not work as easily when your matrix is negative definite or indefinite.
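A minimal NumPy sketch of this recipe (the function name `rank2_psd_approx` and the random test matrix are my own choices; negative leading eigenvalues are clamped to zero, as suggested in the accepted answer):

```python
import numpy as np

def rank2_psd_approx(A):
    """Return x, y with A ~ x x^T + y y^T, built from the top two eigenpairs.

    Assumes A is real symmetric. A negative leading eigenvalue is clamped
    to zero, i.e. the corresponding vector is set to 0.
    """
    lam, V = np.linalg.eigh(A)                 # eigenvalues in ascending order
    x = np.sqrt(max(lam[-1], 0.0)) * V[:, -1]  # scale by sqrt of eigenvalue
    y = np.sqrt(max(lam[-2], 0.0)) * V[:, -2]
    return x, y

# Usage on an arbitrary random symmetric matrix
rng = np.random.default_rng(0)
B = rng.standard_normal((5, 5))
A = B + B.T
x, y = rank2_psd_approx(A)
obj = np.linalg.norm(A - np.outer(x, x) - np.outer(y, y), "fro") ** 2
```

The returned pair is a stationary point of the objective, and in quick experiments its objective value is never beaten by random choices of $x$ and $y$.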