Let $X$ be an $n \times n$ Hermitian matrix. Then we can write $X=A+iB$, where $A$ is real symmetric and $B$ is real skew-symmetric. Let $S$ be the set of $n \times n$ real symmetric matrices, and define the Schatten $1$-norm (also known as the trace norm) by $\lVert M \rVert_1=\sum_j \sigma_j(M)$, where the $\sigma_j(M)$ are the singular values of $M$. To measure how close $X$ is to the subspace of symmetric matrices, we can define the distance $$ D(X) = \min_{Y\in S} \lVert X-Y \rVert_1 .$$ Expanding $X$ into its symmetric and skew-symmetric parts gives $$D(X)= \min_{Y\in S} \lVert (A-Y)+iB \rVert_1 .$$ Intuitively, the minimum should occur at $Y=A$, in which case $D(X)=\lVert B\rVert_1$ is simply the $1$-norm of the imaginary part of $X$. How would one show that this quantity is minimized at $Y=A$?
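For anyone who wants a quick numerical sanity check of the conjecture before reading the proofs, here is a NumPy sketch (the helper `trace_norm` and the random search over symmetric $Y$ are my own additions, not part of the question):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5

# Build a random Hermitian X = A + iB with A real symmetric, B real skew-symmetric.
M = rng.standard_normal((n, n))
A = (M + M.T) / 2
N = rng.standard_normal((n, n))
B = (N - N.T) / 2
X = A + 1j * B

def trace_norm(M):
    # Schatten 1-norm: sum of singular values.
    return np.linalg.svd(M, compute_uv=False).sum()

# Conjectured minimum: D(X) = ||B||_1, attained at Y = A.
D_conjectured = trace_norm(B)
assert np.isclose(trace_norm(X - A), D_conjectured)

# No randomly chosen real symmetric Y should beat Y = A.
for _ in range(1000):
    P = rng.standard_normal((n, n))
    Y = A + (P + P.T) / 2
    assert trace_norm(X - Y) >= D_conjectured - 1e-9
print("conjecture consistent with random search")
```

This is of course only evidence, not a proof, but it makes the claim easy to experiment with.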
Minimizing the Schatten 1-norm over symmetric matrices.
624 Views. Asked by Bumbble Comm (https://math.techqa.club/user/bumbble-comm/detail). There are 2 best solutions below.
If the distance is measured using the Frobenius norm, then the problem is almost trivial.
(This post works through the exercise that Josse suggested at the end of his answer)
Use a real symmetric matrix $A=A^T$ and a real skew-symmetric matrix $B=-B^T$ to construct the Hermitian matrix $$\eqalign{ X &= A + iB \quad\implies\quad X = X^H,\quad X^T = X^* \\ }$$
Use an unconstrained matrix $U\in{\mathbb C}^{n\times n}$ to construct the complex symmetric matrix $$\eqalign{ Y &= {\rm Sym}(U) \doteq \tfrac 12(U+U^T) \quad\implies\quad Y = Y^T,\quad Y^H = Y^* \\ }$$ Calculate the gradient of the objective with respect to $U$ $$\eqalign{ \phi &= \big\|X-Y\big\|_F^2 \;=\; (X-Y)^*:(X-Y) \\ d\phi &= (Y-X)^*:dY &\quad+\; conj \\ &= (Y-X)^*:{\rm Sym}(dU) &\quad+\; conj \\ &= {\rm Sym}(Y-X)^*:dU &\quad+\; conj \\ &= (Y-\tfrac 12(X+X^*))^*:dU &\quad+\; conj \\ &= (Y-{\cal Re}(X))^*:dU &\quad+\; conj \\ \frac{\partial\phi}{\partial U} &= (Y-{\cal Re}(X))^* \quad& \frac{\partial\phi}{\partial U^*} = Y-{\cal Re}(X) \\ }$$ where "+ $conj\,$" denotes the complex conjugate of the first half of the expression.
Anyway, setting either gradient to zero yields $$\eqalign{ Y &= {\cal Re}(X) = A \\ }$$
In the above, a colon is used to denote the trace product, i.e. $$\eqalign{ A:B &= {\rm tr}(A^TB) = \sum_{i=1}^m\sum_{j=1}^n A_{ij}B_{ij} \;=\; B:A \\ X^*:X &= {\rm tr}(X^HX) \;\doteq\; \big\|X\big\|_F^2 \\ }$$ It is interesting that the result obtained using the Frobenius norm is apparently the same as a much more difficult derivation involving the Nuclear norm.
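The stationarity condition $Y = {\cal Re}(X)$ can be checked numerically; here is a NumPy sketch (my own, under the conventions of this answer) confirming that no complex symmetric $Y$ beats $Y = {\cal Re}(X) = A$ in Frobenius distance:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4

M = rng.standard_normal((n, n))
A = (M + M.T) / 2                 # real symmetric part of X
N = rng.standard_normal((n, n))
B = (N - N.T) / 2                 # real skew-symmetric part of X
X = A + 1j * B                    # Hermitian

# Stationary point found by the gradient calculation: Y = Re(X) = A.
best = np.linalg.norm(X - A, "fro")

# Perturbing around A in random complex symmetric directions never improves.
for _ in range(1000):
    U = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    Y = A + (U + U.T) / 2         # Sym(U) shifted to a neighbourhood of A
    assert np.linalg.norm(X - Y, "fro") >= best - 1e-12
print("Y = Re(X) minimises the Frobenius distance over symmetric Y")
```

In fact, for the Frobenius norm the inequality is immediate from orthogonality of the real and imaginary parts, which is why the check never fails.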
Edit: It seems that I let myself get carried away by specifics, and as a consequence completely missed the bigger picture. My old answer is kept at the bottom for historical purposes. Let it be a reminder to all — including myself — that accepted answers on this site can occasionally be clumsy, needlessly complicated, or potentially even incorrect (though the latter doesn't seem to apply here).
New answer — much shorter and more general:
We present the result in a much more general setting (which is probably closer to your intuition). Suppose $\lVert\:\cdot\:\rVert$ is any norm on $M_n(\mathbb{C})$ that is invariant under entry-wise complex conjugation, i.e. $\lVert \overline M\rVert = \lVert M\rVert$ for all $M$. For every real matrix $Y$ we have $\overline Y = Y$, so the triangle inequality gives $$ 2\,\lVert X - Y\rVert \: = \: \lVert X - Y\rVert + \big\lVert\, \overline{X - Y}\,\big\rVert \: \geq \: \big\lVert (X - Y) - (\overline X - Y)\big\rVert \: = \: \big\lVert X - \overline X\big\rVert \: = \: 2\,\lVert B\rVert, $$ since $X = A + iB$ with $A,B$ real implies $\overline X = A - iB$. Hence $\lVert X - Y\rVert \geq \lVert B\rVert$ for every real $Y$, with equality at $Y = A$, so the minimum over the real symmetric matrices is $\lVert B\rVert$ as well.
Now all that remains is to show that the entry-wise complex conjugation $\bar{\ }\, : M_n(\mathbb{C}) \to M_n(\mathbb{C})$ preserves the trace norm. This is not so hard to do. For arbitrary matrices $A,B \in M_n(\mathbb{C})$ we have $\overline{AB} = \overline A \, \overline B$ and $\overline{A}^* = \overline{A^*}$, hence $$ \overline{A}^*\,\overline A \: = \: \overline{A^*A}. $$ Now it is easy to see that $\overline{|A|}$ is a positive semidefinite square root of $\overline{A^*A}$, so it follows that $\overline{|A|} = |\,\overline A\,|$ holds. Finally, we find $$ \lVert \overline A\rVert_1 \: = \: \text{tr}\big(\big|\,\overline A\,\big|\big) \: = \: \text{tr}\left(\overline{|A|}\right) \: = \: \overline{\text{tr}(|A|)} \: = \: \overline{\lVert A\rVert_1} \: = \: \lVert A\rVert_1. $$
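A one-line numerical illustration of this fact (a NumPy sketch, not part of the proof; `trace_norm` is my own helper):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((6, 6)) + 1j * rng.standard_normal((6, 6))

def trace_norm(M):
    # Schatten 1-norm: sum of singular values.
    return np.linalg.svd(M, compute_uv=False).sum()

# Entry-wise conjugation preserves the singular values, hence the trace norm.
assert np.isclose(trace_norm(np.conj(A)), trace_norm(A))
print("conjugation preserves the trace norm")
```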
In summary: the entire result follows essentially from one application of the triangle inequality.
Old answer — needlessly complicated:
Your intuition is correct,
but it's not particularly easy to prove this. We will need quite a few facts about the trace norm. Let $\mathbb{C}^{n\times n}$ denote the algebra of $n\times n$ complex matrices, and let $\lVert \:\cdot\: \rVert_\infty : \mathbb{C}^{n\times n} \to \mathbb{R}_{\geq 0}$ denote the operator norm (also known as the Schatten $\infty$-norm). The key facts are the duality relation $\lVert M\rVert_1 = \max\{\,|\text{tr}(MU)| \,:\, \lVert U\rVert_\infty \leq 1\,\}$ and its consequence $|\text{tr}(MU)| \leq \lVert M\rVert_1\,\lVert U\rVert_\infty$ (this is the "corollary 4" cited below). Now that we have all the ingredients, let's get on with the exercise at hand. Recall that we have $X = A + iB$ with $A$ real symmetric and $B$ real skew-symmetric, and we want to find a real symmetric matrix $Y$ minimising the distance $\lVert X - Y\rVert_1$.
Assume without loss of generality that $A = 0$ holds, so that we have $X = iB$ (otherwise replace $Y$ by $Y + A$, which is again real symmetric); note that then $X^T = iB^T = -X$. We prove that the minimum occurs at $Y = 0$. Choose some $U \in \mathbb{C}^{n\times n}$ satisfying $\lVert U\rVert_\infty \leq 1$ as well as $\lVert X\rVert_1 = \text{tr}(XU)$. Now we have $$ \text{tr}(XU) = \text{tr}\big((XU)^T\big) = \text{tr}\big(U^TX^T\big) = -\text{tr}\big(U^TX\big) = -\text{tr}\big(XU^T\big), $$ so we may assume without loss of generality that $U$ is skew-symmetric (replace $U$ by $\tfrac{1}{2}(U - U^T)$, if necessary; then the properties $\lVert U\rVert_\infty \leq 1$ and $\lVert X\rVert_1 = \text{tr}(XU)$ still hold).
Now let $Y$ be an arbitrary real symmetric matrix, then we have $$ \text{tr}(YU) = \text{tr}\big((YU)^T\big) = \text{tr}\big(U^TY^T\big) = \text{tr}\big(U^TY\big) = -\text{tr}(UY) = -\text{tr}(YU), $$ hence $\text{tr}(YU) = 0$. As a result, by corollary 4 we have $$ \lVert X - Y\rVert_1 \geq \big|\text{tr}\big((X - Y)U\big)\big| = |\text{tr}(XU) - \text{tr}(YU)| = \big|\lVert X\rVert_1 - 0\big| = \lVert X\rVert_1, $$ which proves the result.
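The duality argument above can be verified numerically. The sketch below (my own, assuming NumPy) builds the dual maximizer $U = QP^H$ from the SVD $X = P\Sigma Q^H$, skew-symmetrises it, and checks the two key facts: the skew-symmetrised $U$ still attains $\lVert X\rVert_1$, and $\text{tr}(YU) = 0$ for real symmetric $Y$:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5

N = rng.standard_normal((n, n))
B = (N - N.T) / 2
X = 1j * B                              # the A = 0 case: X = iB

def trace_norm(M):
    return np.linalg.svd(M, compute_uv=False).sum()

# Dual construction: if X = P @ diag(s) @ Qh, then U = Q @ P^H satisfies
# ||U||_inf = 1 (it is unitary) and tr(X U) = ||X||_1.
P, s, Qh = np.linalg.svd(X)
U = Qh.conj().T @ P.conj().T
assert np.isclose(np.trace(X @ U).real, trace_norm(X))

# Skew-symmetrising U keeps both properties, as claimed in the answer.
U_sk = (U - U.T) / 2
assert np.linalg.norm(U_sk, 2) <= 1 + 1e-9
assert np.isclose(np.trace(X @ U_sk).real, trace_norm(X))

# tr(Y U_sk) = 0 for real symmetric Y, which yields the lower bound.
M = rng.standard_normal((n, n))
Y = (M + M.T) / 2
assert np.isclose(np.trace(Y @ U_sk), 0)
print("duality argument verified numerically")
```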
Closing remarks: