Maximal trace of maps induced on $\Bbb R^d$ by a symmetric $n\times n$ matrix $M$ is the sum of the $d$ largest eigenvalues of $M$.

56 Views Asked by At

Prove that if $M \in \mathbb R^{n \times n}$ is symmetric and $d \leq n$ then:

$$\max_{\substack{U \in \mathbb R^{n \times d} \\ U^T U = I_{d \times d}}} \operatorname{Tr}(U^T MU)= \sum_{k=1}^{d} \lambda_k^{(+)}(M)$$

where $\lambda_k^{(+)}$ is the kth largest eigenvalue of $M$ (i.e., $\lambda_1^{(+)} \geq \lambda_2^{(+)} \geq \dots$).

How can I solve this question?

1

There are 1 best solutions below

0
On

$M$ being symmetric with real entries, there is an orthogonal matrix $P$ such that $N:=PMP^T$ is a diagonal matrix.

Note that both sides of the demanded equality assume the same value when we replace $M$ by $N$ (using the change of variables $U=PV$ for the left-hand side, and using the fact that $M$ and $N$ have the same spectrum for the right-hand side) therefore it is enough to prove the equality for diagonal matrices. Also note that with the help of a permutation matrix you can also assume that the diagonal entries are sorted from the highest to the lowest.

Now if $M$ is diagonal, with entries decreasing from the upper left ($\lambda_1$) to the lower right ($\lambda_n$), then the product $MU$ is the matrix $U$ where the $i$-th row has been multiplied by $\lambda_i$, and so for every $i$.

When you consider the $i$-th diagonal entry of $U^TMU$, it appears to be an affine combination of the eigenvalues of $M$, with weights given by the squares of the entries of the $i$-th column of $U$. These squares do add up to $1$ by assumption ($U^TU=\Bbb 1$), which allows us to speak about an affine combination.

This explains why individually, no diagonal entry of $U^TMU$ can be bigger than $\lambda_1$, and incidentally solves the case $d=1$. Now when $d>1$, there is a little trick to make use of the fact that the columns of $U$ are orthogonal. Let $U=(u_{ij})$. Let us rewrite $$\operatorname{Tr}(U^TMU)=\sum_{i=1}^n\lambda_i\sum_{j=1}^du_{ij}^2$$ as $$\operatorname{Tr}(U^TMU)=\sum_{i=1}^n\lambda_{d+1}\sum_{j=1}^du_{ij}^2+\sum_{i=1}^n(\lambda_i-\lambda_{d+1})\sum_{j=1}^du_{ij}^2$$ The first term is equal to $$\begin{array}[rcl] .\sum_{i=1}^n\lambda_{d+1}\sum_{j=1}^du_{ij}^2&=&\sum_{i=1}^n\lambda_{d+1}\sum_{j=1}^du_{ij}^2\\&=&\lambda_{d+1}\sum_{j=1}^d\sum_{i=1}^nu_{ij}^2\\&=&\lambda_{d+1}\sum_{j=1}^d1\\ &=&d\lambda_{d+1}\end{array}$$ The second term is equal to $$\begin{array}[rcl] .\sum_{i=1}^n(\lambda_i-\lambda_{d+1})\sum_{j=1}^du_{ij}^2&=&\left(\sum_{i=1}^d(\lambda_i-\lambda_{d+1})\sum_{j=1}^du_{ij}^2\right)+\left(\underset{\text{ since $\lambda_i\leq\lambda_{d+1}$ when $i\geq d+1$}}{\text{something non-positive}}\right)\\&\leq &\sum_{i=1}^d(\lambda_i-\lambda_{d+1})\sum_{j=1}^du_{ij}^2\end{array}$$

Now the trick is that since the columns of $U$ are unit and pairwise orthogonal, $U$ can be completed into an orthogonal matrix, and being orthogonal works both ways: the rows of the completed $U$ will be unit vectors. Therefore, for all $i$: $$\sum_{j=1}^du_{ij}^2\leq 1$$ and it follows that our second term is less than or equal to: $$\sum_{i=1}^d(\lambda_i-\lambda_{d+1})=\left(\sum_{i=1}^d \lambda_i\right)-d\lambda_{d+1}$$

Combining both terms:

$$\operatorname{Tr}(U^TMU)\leq \sum_{i=1}^d \lambda_i$$ as required.

The fact that the maximum value over all $U$ is exactly $\sum_{i=1}^d \lambda_i$ is clear, since all you need to reach it is to choose $U$ to be the $d\times d$ identity matrix stacked on top of rows of $0$'s.