Sum of k-largest eigenvalues of a symmetric matrix as an SDP


I found the following statement in a Google search. If $S_k(\mathbf{X})$ is the sum of the $k$ largest eigenvalues of a symmetric $m\times m$ matrix $\mathbf{X}$, then $$S_k(\mathbf{X}) \leq t$$ holds iff

$$ t - k s - \mathrm{trace}(\mathbf{Z}) \geq 0 \\ \mathbf{Z} \succeq 0 \\ \mathbf{Z} - \mathbf{X} + s\mathbf{I}_m \succeq 0$$ where $s$ is a scalar whose role wasn't described. I tried it in CVX with $s$ as a positive variable, and it worked: I could compute the sum of the $k$ largest eigenvalues of $\mathbf{X}$.

I seem to be going around in circles trying to prove this. I would like to prove it on my own; can anyone provide a starting direction, or the first few lines of the proof?

Thank you.

Best Answer

We'll start with the following expression for $S_k(\mathbf{X})$ as a maximum over $k$ orthonormal vectors: $$\begin{gathered} \max_{\mathbf{v}_1,\ldots,\mathbf{v}_k} \sum_{i=1}^k \mathbf{v}_i^T \mathbf{X}\mathbf{v}_i\\ \mathbf{v}_i^T \mathbf{v}_j = \begin{cases} 1 & i = j \\ 0 & i \ne j\end{cases} \end{gathered}$$

Collecting the vectors as the columns of the $m \times k$ matrix $\mathbf{V}$, in matrix notation this is: $$\begin{gathered} \max_{\mathbf{V}} \text{ trace}(\mathbf{V}^T \mathbf{X} \mathbf{V})\\ \mathbf{V}^T \mathbf{V} = \mathbf{I}_k \end{gathered}$$

Since this is a nonconvex problem, we'll consider a convex relaxation. Note that $\mathbf{V}^T \mathbf{V} = \mathbf{I}_k$ only if $\mathbf{0} \preceq \mathbf{V} \mathbf{V}^T \preceq \mathbf{I}$ and $\text{trace}(\mathbf{V} \mathbf{V}^T)=k$. So, letting $\mathbf{A}=\mathbf{V} \mathbf{V}^T$, we consider the following SDP, which is a convex relaxation of the problem above: $$ \begin{gathered} \max_\mathbf{A} \ \text{trace}(\mathbf{A}\mathbf{X})\\ \mathbf{0} \preceq \mathbf{A} \preceq \mathbf{I} \\ \text{trace}(\mathbf{A}) = k \end{gathered} $$

Taking the dual (and reformulating slightly) gives: $$ \begin{gathered} \min_{\mathbf{Z},s} \ \text{trace}(\mathbf{Z})+ks\\ \mathbf{Z} \succeq \mathbf{0}\\ \mathbf{Z} + s \mathbf{I} \succeq \mathbf{X} \end{gathered} $$

Now that we have this SDP primal-dual pair, we can prove that their common optimal value is $S_k(\mathbf{X})$ just by constructing optimal points. Suppose $\mathbf{X}=\mathbf{U}\text{ diag}(\lambda)\mathbf{U}^T$, where $\mathbf{U}$ is an orthogonal matrix of eigenvectors and $\lambda$ is the vector of eigenvalues sorted in decreasing order, that is, $\lambda_i\ge\lambda_{i+1}$.
Then let $$ \begin{aligned} \mathbf{A}^*&=\mathbf{U}\text{ diag}(\mathbf{1}_k,\mathbf{0}_{m-k})\mathbf{U}^T\\ \mathbf{Z}^*&=\mathbf{U}\text{ diag}((\lambda-\lambda_k \mathbf{1})_+)\mathbf{U}^T\\ s^*&=\lambda_k \end{aligned} $$ Here, $(\cdot)_+$ is the positive part of a vector (it sets the negative components to zero). One can check that these points are feasible for their respective problems. We also check the objective values: $$ \begin{aligned} \text{trace}(\mathbf{A}^*\mathbf{X})&=\text{trace}(\mathbf{U}\text{ diag}(\lambda)\mathbf{U}^T\mathbf{U} \text{ diag}(\mathbf{1}_k,\mathbf{0}_{m-k})\mathbf{U}^T)\\ &=\text{trace}(\text{diag}(\lambda)\text{ diag}(\mathbf{1}_k,\mathbf{0}_{m-k}))\\ &=\sum_{i=1}^k \lambda_i\\ &=S_k(\mathbf{X})\\ \text{trace}(\mathbf{Z}^*)+ks^* &= k\lambda_k +\sum_{i=1}^m(\lambda_i-\lambda_k)_+\\ &= \sum_{i=1}^k((\lambda_i-\lambda_k)_+ + \lambda_k)\\ &=S_k(\mathbf{X}) \end{aligned} $$ (In the second computation, the terms with $i > k$ drop out because $\lambda_i \le \lambda_k$ there, while $(\lambda_i-\lambda_k)_+ = \lambda_i-\lambda_k$ for $i \le k$.)

Recall that every dual feasible point gives a bound on the primal objective and vice versa. Since these points give the same objective value for their respective problems, they must be optimal. Therefore $S_k(\mathbf{X})$ is the optimal value of both SDPs.
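To see the construction in action, here is a small numerical check of the claimed optimal points, using numpy only. The variable names mirror $\mathbf{A}^*$, $\mathbf{Z}^*$, $s^*$ from the answer; the random test matrix and everything else are scaffolding.

```python
import numpy as np

rng = np.random.default_rng(1)
m, k = 6, 3
B = rng.standard_normal((m, m))
X = (B + B.T) / 2                 # random symmetric test matrix

lam, U = np.linalg.eigh(X)        # eigh returns ascending order
lam, U = lam[::-1], U[:, ::-1]    # sort eigenvalues in decreasing order

# The constructed primal/dual points (lam[k-1] is lambda_k in 1-based indexing).
A_star = U @ np.diag(np.r_[np.ones(k), np.zeros(m - k)]) @ U.T
Z_star = U @ np.diag(np.maximum(lam - lam[k - 1], 0)) @ U.T
s_star = lam[k - 1]

S_k = lam[:k].sum()
primal = np.trace(A_star @ X)
dual = np.trace(Z_star) + k * s_star

# Feasibility: 0 <= A* <= I, trace(A*) = k; Z* >= 0; Z* + s* I >= X.
assert np.all(np.linalg.eigvalsh(A_star) > -1e-9)
assert np.all(np.linalg.eigvalsh(np.eye(m) - A_star) > -1e-9)
assert np.isclose(np.trace(A_star), k)
assert np.all(np.linalg.eigvalsh(Z_star) > -1e-9)
assert np.all(np.linalg.eigvalsh(Z_star + s_star * np.eye(m) - X) > -1e-9)

print(primal, dual, S_k)  # all three coincide
```

The dual feasibility check works out exactly as in the proof: $\mathbf{Z}^* + s^*\mathbf{I} - \mathbf{X}$ has eigenvalues $(\lambda_i-\lambda_k)_+ + \lambda_k - \lambda_i$, which are $0$ for $i \le k$ and $\lambda_k - \lambda_i \ge 0$ for $i > k$.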