The norm $\|S-Q\|_F$ where $Q$ is orthogonal is minimised by $Q=I$

151 Views Asked by At

Problem:

Suppose that $S$ is symmetric and semi-positive-definite. Let $\|\cdot \|_F$ be the Frobenius norm. Show that

$$\|S-I \|_F \leq \|S-Q\|_F$$

for all orthogonal matrices $Q$, where $I$ is the identity matrix.


Attempt:

So from what I know, the Frobenius norm is defined as

$$\|A\|_F:= \bigg(\sum_{i,j} a_{ij}^2\bigg)^{1/2}$$

and one of the properties of it is that $\|A\|_F = \big(\sum_i \sigma_i^2 \big)^{1/2}$ where $\sigma_i$ are the Singular Values of $A$.

Also, $\|QA\|_F = \|AQ\|_F = \|A\|_F$ for any orthogonal matrix $Q$.

Thus, if we consider the Singular Value Decomposition of $S$, say $S=UDV$ where $U,V$ are orthogonal and $D = \text{diag}(\sigma_1,\dots,\sigma_n)$, then

$$\|S-I\|_F = \|UDV - I\|_F = \|D - U^TV^T\|_F$$

but I feel like I'm not getting anyhere with this approach.

$\color{red}{\text{In particular,}}$ I don't really know how to use the fact that $S$ is symmetric and semi-positive-definite. Does this have any effect on the form of the SVD for $S$?

Any help would be much appreciated. Thanks!

2

There are 2 best solutions below

0
On BEST ANSWER

Use the fact that $\|A\|_F^2 = \operatorname{Tr}(A^TA)$. In particular, we have $$ \|S-Q\|_F^2 = \operatorname{Tr}([S-Q]^T[S-Q]) = \|S\|_F^2 + n^2 - 2 \operatorname{Tr}(Q^TS) $$ where $n$ denotes the common size of the matrices $S,Q$. With that said, it's clear that we want an orthogonal matrix $Q$ that maximizes $\operatorname{Tr}(Q^TS)$.

Because $S$ is symmetric and positive semidefinite, there exists an orthogonal matrix $V$ and a non-negative diagonal matrix $D$ with $S = VDV^T$ (regarding your question about using the properties of $S$: note that this is an SVD). Note that $$ \operatorname{tr}(Q^TVDV^T) = \operatorname{tr}([V^TQ^TV]D) = \operatorname{tr}([V^TQV]^TD). $$ In other words, we want an orthogonal matrix $W = V^TQV$ that maximizes $\operatorname{tr}(W^TD)$. We see that this maximum is achieved with $W = I$. In particular, the entries of an orthogonal matrix must all be at most $1$. So, we have $$ \operatorname{tr}(W^TD) = \sum_{i=1}^n w_{ii}d_{ii} \leq \sum_{i=1}^n d_{ii} = \operatorname{tr}(D) = \operatorname{tr}(I^TD). $$ Note that the only $Q$ for which $W = I = V^TQV$ is given by $Q = I$. The conclusion follows.

Note: This problem is an instance of the orthogonal procrustes problem.

0
On

The fact that $S$ is symmetric and positive semi-definite means it has a spectral decomposition $S = U D U^T$ where $D$ is a diagonal matrix whose entries are all non-negative, and $U$ is orthogonal. So we are looking for the minimum of $$\|U D U^T - UU^T Q\|_F = \|D U^T - U^TQ\|_F = \|D U^T - U^T Q U U^{T}\|_F = \|D - U^T Q U\|_F$$ But note that since $Q$ is any orthogonal matrix, and $U$ is also orthogonal, $U^T Q U$ can also take on any orthogonal matrix. So the problem reduces to minimizing $\|D - Q\|$, for $Q$ orthogonal. Note that since $\|A\|_F = \text{tr}(A^T A)$ for real matrices $A$, it follows that $$\|D - Q\|_F = \text{tr}((D - Q)^T (D- Q)) = \text{tr}(D^2) - \text{tr}(Q^T D + D Q) + \text{tr}(I)$$ The second equality follows because trace is linear. But note that $DQ$ and $Q^T D$ are transposes of each other, so we just must find $Q$ that maximizes $\text{tr}(Q^T D)$. But it's clear that $\text{tr}(Q^T D)$ is maximized when all entries of $Q$ are along the diagonal, and as large as possible. In particular, since the diagonal entries of an orthogonal matrix are always less than or equal to $1$, it follows that $$\text{tr}(Q^T D) \leq \text{tr}(D)$$ Note that this equality is satisfied when $Q^T = Q = I$, so indeed $\text{tr}(Q^T D)$ is maximized when $Q = I$. Therefore, the orthogonal matrix $Q$ that minimizes $\|S - Q\|_F$ is equal to $U^T I U = I$. $\square$