Matrix Concentration for Products: Orthogonality Property


The following comes from page 3 of the paper "Matrix Concentration for Products" found here https://arxiv.org/pdf/2003.05437.pdf.

Given two independent, complex, square random matrices $Y$ and $Z$: $$ YZ=\mathbb{E}(Y)Z+(Y-\mathbb{E}(Y))Z.$$ I understand that $$ \mathbb{E}((Y-\mathbb{E}(Y))Z|Z)=0. $$ Now, this is called an orthogonality property (why?) and is apparently used to show: $$ \mathbb{E}(\left\| YZ\right\|_2^2) =\mathbb{E}(\left\|(\mathbb{E}(Y))Z\right\|_2^2) + \mathbb{E}(\left\| (Y-\mathbb{E}(Y))Z\right\|_2^2) $$

where $\left\| \cdot \right\|_2$ refers to the Schatten 2-norm, i.e. the Frobenius norm.

Could someone clarify how exactly this property is used here? I assume the tower property comes into play, but I'm not sure how.

Thank you.

Best answer

This problem suffers from a proliferation of symbols and parentheses, so define
$A:= \mathbb{E}(Y)$. Then

$\mathbb{E}\Big(\left\| YZ\right\|_F^2\Big) $
$=\mathbb{E}\Big(\text{trace}\big((YZ)^* YZ\big)\Big) $
$=\text{trace}\Big(\mathbb{E}\big((YZ)^* YZ\big)\Big) $
$=\text{trace}\Big(\mathbb{E}\big((AZ+(Y-A)Z)^* (AZ+(Y-A)Z)\big)\Big) $
$=\text{trace}\Big(\mathbb{E}\big((AZ)^*AZ+ ((Y-A)Z)^*(Y-A)Z\big)\Big) + \text{trace}\Big(\mathbb{E}\big(C\big)\Big)$
$=\mathbb{E}(\left\|(A)Z\right\|_F^2) + \mathbb{E}(\left\| (Y-A)Z\right\|_F^2)+ \text{trace}\Big(\mathbb{E}\big(C\big)\Big)$
$=\mathbb{E}(\left\|(A)Z\right\|_F^2) + \mathbb{E}(\left\| (Y-A)Z\right\|_F^2)+0$

where $C$ is a matrix containing the cross terms. In particular $C$ is given by
$C= \big(AZ\big)^*\big(Y-A\big)Z + \big((Y-A)Z\big)^*\big(AZ\big)$
Using linearity of both expectation and trace, and the fact that trace and expectation commute, we can focus on the first term. By the cyclic property of the trace,
$\text{trace}\Big(\mathbb{E}\Big(\big(AZ\big)^*\big(Y-A\big)Z\Big)\Big) $
$=\text{trace}\Big(\mathbb{E}\Big(Z^*A^*\big(Y-A\big)Z\Big)\Big)$
$=\mathbb{E}\Big(\text{trace}\Big(Z^*A^*\big(Y-A\big)Z\Big)\Big)$
$=\mathbb{E}\Big(\text{trace}\Big(\big(Y-A\big)ZZ^*A^*\Big)\Big)$
$=\text{trace}\Big(\mathbb{E}\Big(\big(Y-A\big)ZZ^*A^*\Big)\Big)=0$
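because, applying the tower property and your orthogonality relation (note that $A^*$ is deterministic, that $ZZ^*$ is determined by $Z$, and that by independence $\mathbb{E}(Y-A \mid Z)=\mathbb{E}(Y-A)=\mathbf 0$),
$\mathbb{E}\Big(\big(Y-A\big)ZZ^*A^*\Big)$
$=\mathbb{E}\Big(\big(Y-A\big)ZZ^*\Big)A^*$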
$=\mathbb{E}\Big[\mathbb{E}\Big(\big(Y-A\big)ZZ^*\Big\vert Z\Big)\Big]A^*$
$=\mathbb{E}\Big[\mathbb{E}\Big(\big(Y-A\big)\Big\vert Z\Big)ZZ^*\Big]A^*$
$=\mathbb{E}\Big[\mathbf 0ZZ^*\Big]A^*$
$=\mathbf 0$
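so the trace is zero. The second term of $C$ is the conjugate transpose of the first one, so its trace is the complex conjugate of zero, i.e. zero as well. This is also why the relation is called an orthogonality property: the cross term is the inner product $\mathbb{E}\,\text{trace}\big((AZ)^*(Y-A)Z\big)$ of $AZ$ and $(Y-A)Z$, and it vanishes, so the identity is the Pythagorean theorem for these two orthogonal pieces.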
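
As a sanity check, the identity can be verified numerically. The sketch below is not taken from the paper or from the derivation above; it assumes, purely for illustration, that $Y$ and $Z$ are independent and each uniformly distributed on a small, arbitrary finite set of complex matrices, so every expectation is an exact finite average. The names `Y_support`, `Z_support`, `expect`, and `frob_sq` are hypothetical helpers.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3

# Assumption for illustration: Y and Z are independent, each uniform on a
# small finite set of complex n x n matrices (the supports are arbitrary).
Y_support = [rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
             for _ in range(4)]
Z_support = [rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
             for _ in range(5)]

# A = E(Y): exact mean over the finite support of Y.
A = sum(Y_support) / len(Y_support)


def frob_sq(M):
    """Squared Frobenius (Schatten 2) norm of a matrix."""
    return np.linalg.norm(M, "fro") ** 2


def expect(f):
    """Exact expectation of f(Y, Z) under the independent uniform laws."""
    total = sum(f(Y, Z) for Y in Y_support for Z in Z_support)
    return total / (len(Y_support) * len(Z_support))


lhs = expect(lambda Y, Z: frob_sq(Y @ Z))               # E||YZ||_F^2
mean_part = expect(lambda Y, Z: frob_sq(A @ Z))         # E||AZ||_F^2
fluct_part = expect(lambda Y, Z: frob_sq((Y - A) @ Z))  # E||(Y-A)Z||_F^2

# First cross term: E trace((AZ)^* (Y-A) Z), which should vanish.
cross = expect(lambda Y, Z: np.trace((A @ Z).conj().T @ (Y - A) @ Z))

print("E||YZ||^2                :", lhs)
print("E||AZ||^2 + E||(Y-A)Z||^2:", mean_part + fluct_part)
print("|cross term|             :", abs(cross))
```

The first two printed values should agree up to floating-point rounding, and the cross term should be numerically zero. Replacing the exact averages with Monte Carlo samples from continuous distributions would show the same identity only approximately.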