Schatten norms Triangle inequality


I want to solve the following exercise:

Show that the p-Schatten norms are indeed norms on the space of Hermitian matrices.

So far I have proved the first two norm properties as follows. Proof:

  1. Show $||0_M||_p=0$:
    All singular values of the null matrix are zero, hence
    $||0_M||_p=(\sum_j\delta_j(0_M)^p)^{\frac{1}{p}}=0$
  2. Show $||cM||=|c|||M||$:
    Let $\delta_j(M)$ be an arbitrary singular value of $M$. It follows immediately from the definition that $|c|\delta_j(M)$ is a singular value of the matrix $cM$. Hence
    $||cM||_p=(\sum_j\delta_j(cM)^p)^{\frac{1}{p}}=|c|(\sum_j\delta_j(M)^p)^{\frac{1}{p}}=|c|||M||_p$.
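The two properties above are easy to confirm numerically; here is a small NumPy sketch (the `schatten_norm` helper and the random Hermitian matrix are illustrative assumptions, not from the exercise):

```python
import numpy as np

def schatten_norm(M, p):
    """Schatten p-norm: the l_p norm of the vector of singular values of M."""
    s = np.linalg.svd(M, compute_uv=False)
    return np.sum(s**p) ** (1.0 / p)

rng = np.random.default_rng(0)
G = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
M = G + G.conj().T   # Hermitian test matrix
c = -2.5             # real scalar keeps cM Hermitian

for p in (1, 2, 3):
    assert np.isclose(schatten_norm(np.zeros((4, 4)), p), 0.0)                 # property 1
    assert np.isclose(schatten_norm(c * M, p), abs(c) * schatten_norm(M, p))   # property 2
```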

My question: I now need help proving the third property, the triangle inequality. Presumably this can be done using the Lidskii and Hölder inequalities, but unfortunately I don't know how to proceed.

Thanks :)

Two answers are given below.

Answer 1

I give the proof below for arbitrary matrices in $\mathbb C^{n\times n}$. (With zero padding this implies the result for non-square matrices as well.) Throughout I assume the singular values are ordered $\sigma_1\geq \sigma_2\geq\dots\geq\sigma_n\geq 0$.

Let $\Sigma_X$ be the $n\times n$ diagonal matrix containing the singular values of $X$ (with above ordering of course). The structure of the proof is
$\big\Vert A+B\big\Vert_{S_p}\leq \big\Vert \Sigma_A+\Sigma_B\big\Vert_{S_p}\leq \big\Vert \Sigma_A\big\Vert_{S_p}+\big\Vert\Sigma_B\big\Vert_{S_p}=\big\Vert A\big\Vert_{S_p}+\big\Vert B\big\Vert_{S_p}$
The second inequality is immediate -- $\big\Vert\Sigma_X\big\Vert_{S_p}=\big\Vert\text{vec}(\Sigma_X)\big\Vert_{p}$ (the extra zeros in $\text{vec}(\Sigma_X)$ do not affect the $p$-norm), so it is inherited from the triangle inequality for $L_p$ norms.

Remaining claim:
$\big\Vert A+B\big\Vert_{S_p}\leq \big\Vert \Sigma_A+\Sigma_B\big\Vert_{S_p}$

(i) $\Sigma_{A+B}\preceq_w\big(\Sigma_A+\Sigma_B\big)$
where $\preceq_w$ denotes weak majorization. (Proven at the end.)

(ii) $x\mapsto x^p$ is a convex and increasing function for $x\geq 0$ and $p\geq 1$ (check the first two derivatives)

(iii) combining (i) and (ii) tells us
$\big\Vert A+B\big\Vert_{S_p}^p=\big\Vert \Sigma_{A+B}\big\Vert_{S_p}^p\leq \big\Vert \Sigma_A+\Sigma_B\big\Vert_{S_p}^p\implies \big\Vert A+B\big\Vert_{S_p}\leq \big\Vert \Sigma_A+\Sigma_B\big\Vert_{S_p}$
where the implication follows by taking $p$th roots.
If you are unfamiliar with applying increasing convex functions in the context of weak majorization (the key fact: $x\preceq_w y$ implies $\sum_k f(x_k)\leq\sum_k f(y_k)$ for increasing convex $f$), the standard reference is Inequalities: Theory of Majorization and Its Applications by Marshall and Olkin. (That said, you can work this out yourself with a good enough grasp of Schur convexity and regular (strong) majorization.)
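As a quick numerical sanity check (not part of the proof), the two-step chain $\Vert A+B\Vert_{S_p}\leq \Vert \Sigma_A+\Sigma_B\Vert_{S_p}\leq \Vert A\Vert_{S_p}+\Vert B\Vert_{S_p}$ can be verified with NumPy; the `schatten` helper and random matrices below are illustrative, not from the original:

```python
import numpy as np

def schatten(M, p):
    # Schatten p-norm: l_p norm of the singular values of M
    return np.sum(np.linalg.svd(M, compute_uv=False)**p) ** (1 / p)

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5))
B = rng.standard_normal((5, 5))
p = 3

# np.linalg.svd returns singular values already sorted in descending order
sA = np.linalg.svd(A, compute_uv=False)
sB = np.linalg.svd(B, compute_uv=False)
mid = np.sum((sA + sB)**p) ** (1 / p)   # || Sigma_A + Sigma_B ||_{S_p}

assert schatten(A + B, p) <= mid + 1e-10                      # first inequality
assert mid <= schatten(A, p) + schatten(B, p) + 1e-10         # second inequality
```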

Proof of (i):
Let $(A+B)$ have polar decomposition
$(A+B) = UP = UQ\Sigma_{(A+B)} Q^*$
and for $r\in\big\{1,2,\dots,n\big\}$ define

$S_r:= Q\begin{bmatrix} I_r &\mathbf {0}\\ \mathbf {0} &\mathbf {0}\end{bmatrix}Q^*U^*$

$\sum_{k=1}^r \sigma_k^{(A+B)}=\text{trace}\big(S_r(A+B)\big)=\text{trace}\Big(S_rA\Big)+\text{trace}\Big(S_rB\Big)$
$\leq \Big(\sum_{k=1}^r \sigma_k^{(A)}\cdot 1\Big)+\Big(\sum_{k=1}^r \sigma_k^{(B)}\cdot 1\Big)= \sum_{k=1}^r \big(\sigma_k^{(A)}+\sigma_k^{(B)}\big)$
by the von Neumann trace inequality (the singular values of $S_r$ are $r$ ones and $n-r$ zeros). This proves the weak majorization and completes the proof.
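The weak majorization in (i) can also be spot-checked numerically (a sanity check under random inputs, not a proof):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 6
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))

s_sum = np.linalg.svd(A + B, compute_uv=False)   # sigma(A+B), descending
s_ab = (np.linalg.svd(A, compute_uv=False)
        + np.linalg.svd(B, compute_uv=False))    # sigma(A) + sigma(B), descending

# weak majorization: every leading partial sum on the left is dominated
assert np.all(np.cumsum(s_sum) <= np.cumsum(s_ab) + 1e-10)
```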

Answer 2

I'll give another proof, this time using a method that does not rely on majorization.

Again the underlying idea is to map these matrices to diagonal matrices $\Sigma_{A}$ and $\Sigma_{B}$ containing the singular values, and from there you inherit the desired result from regular coordinate vector $L_p$ norms. I take for granted Hölder's Inequality for Schatten norms, i.e. for matrices in $\mathbb C^{n\times n}$

$\big \vert\text{trace}\big(X^*Y\big)\big \vert \leq \Big \Vert X\Big\Vert_{S_q}\cdot \Big \Vert Y\Big\Vert_{S_p}$, where $p,q$ are Hölder conjugates.
(Easy proof: use the von Neumann trace inequality, then Hölder for vectors.)
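A quick numerical illustration of this Hölder inequality for Schatten norms (random real matrices, so $X^*=X^T$; the helper names are mine):

```python
import numpy as np

def schatten(M, p):
    # Schatten p-norm via singular values
    return np.sum(np.linalg.svd(M, compute_uv=False)**p) ** (1 / p)

rng = np.random.default_rng(3)
X = rng.standard_normal((4, 4))
Y = rng.standard_normal((4, 4))
p = 3.0
q = p / (p - 1)  # Hölder conjugate: 1/p + 1/q = 1

lhs = abs(np.trace(X.T @ Y))  # |trace(X^* Y)| for real matrices
assert lhs <= schatten(X, q) * schatten(Y, p) + 1e-10
```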

Now we use quasi-linearization and observe
$\Big \Vert Y\Big\Vert_{S_p} = \max_{X: \Vert X\Vert_{S_q} =1}\big\vert\text{trace}\big(X^*Y\big)\big \vert$
This comes from meeting the above Hölder's Inequality with equality, in particular with

$X:= UQ\Sigma_X Q^*$
where $Y$ has polar decomposition $Y=UP=UQ\Sigma_YQ^*$; hence
$\big\vert\text{trace}\big(X^*Y\big)\big \vert= \big\vert\text{trace}\big(\Sigma_X^*\Sigma_Y\big)\big \vert =\big\vert\text{trace}\big(\Sigma_X\Sigma_Y\big)\big \vert = \big \vert \sum_{k=1}^n \sigma_k^{(X)}\cdot \sigma_k^{(Y)}\big \vert $
and Hölder's inequality for coordinate vectors gives the equality conditions for choosing $\Sigma_X$: take $\sigma_k^{(X)}\propto \big(\sigma_k^{(Y)}\big)^{p-1}$, normalized so that $\big\Vert \Sigma_X\big\Vert_{S_q}=1$.
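The equality-achieving $X$ is easy to build numerically from the SVD of $Y$ (in SVD terms $Y=U_s\,\mathrm{diag}(s)\,V^*$, the construction $UQ\Sigma_XQ^*$ becomes $U_s\Sigma_XV^*$); a hedged sketch:

```python
import numpy as np

def schatten(M, p):
    return np.sum(np.linalg.svd(M, compute_uv=False)**p) ** (1 / p)

rng = np.random.default_rng(4)
Y = rng.standard_normal((4, 4))
p = 3.0
q = p / (p - 1)

U, s, Vh = np.linalg.svd(Y)              # Y = U @ diag(s) @ Vh
sx = s**(p - 1) / np.sum(s**p)**(1 / q)  # Hölder equality choice, ||Sigma_X||_q = 1
X = U @ np.diag(sx) @ Vh                 # shares singular vectors with Y

assert np.isclose(schatten(X, q), 1.0)                       # feasible: unit S_q norm
assert np.isclose(abs(np.trace(X.T @ Y)), schatten(Y, p))    # attains ||Y||_{S_p}
```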

Main proof:
$\Big \Vert (A+B)\Big\Vert_{S_p}$
$= \max_{X: \Vert X\Vert_{S_q} =1}\big\vert\text{trace}\big(X^*(A+B)\big)\big \vert$
$= \max_{X: \Vert X\Vert_{S_q} =1}\big\vert\text{trace}\big(X^*A\big)+ \text{trace}\big(X^*B\big)\big \vert$
$\leq \max_{X: \Vert X\Vert_{S_q} =1}\Big\{\big\vert\text{trace}\big(X^*A\big)\big\vert+ \big\vert\text{trace}\big(X^*B\big)\big \vert\Big\} $
$\leq \max_{X_1: \Vert X_1\Vert_{S_q} =1}\big\vert\text{trace}\big(X_1^*A\big)\big\vert+ \max_{X_2: \Vert X_2\Vert_{S_q} =1}\big\vert\text{trace}\big(X_2^*B\big)\big \vert $
$=\Big \Vert A\Big\Vert_{S_p}+\Big \Vert B\Big\Vert_{S_p}$
where the inequalities are (i) the triangle inequality for scalars and (ii) the fact that maximizing the two terms separately over their own $X_1,X_2$ can only increase the value.
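Finally, a randomized numerical check of the conclusion itself, using Hermitian matrices as in the original question (test harness is mine, not part of either proof):

```python
import numpy as np

def schatten(M, p):
    # Schatten p-norm via singular values (works for complex matrices too)
    return np.sum(np.linalg.svd(M, compute_uv=False)**p) ** (1 / p)

rng = np.random.default_rng(5)
for p in (1, 1.5, 2, 4):
    for _ in range(20):
        G1 = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
        G2 = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
        A, B = G1 + G1.conj().T, G2 + G2.conj().T   # random Hermitian matrices
        assert schatten(A + B, p) <= schatten(A, p) + schatten(B, p) + 1e-9
```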