Prove that $\|{X}\|_{*}=\min _{{A B}={X}}\|{A}\|_{F}\|{B}\|_{F}=\min _{{A B}={X}} \frac{1}{2}\left(\|{A}\|_{F}^{2}+\|{B}\|_{F}^{2}\right)$.

176 Views Asked by At

For any matrix $X\in\mathbb{R}^{m\times n}$, I am confused with

$$\|{X}\|_{*}=\min _{{A B}={X}}\|{A}\|_{F}\|{B}\|_{F}=\min _{{A B}={X}} \frac{1}{2}\left(\|{A}\|_{F}^{2}+\|{B}\|_{F}^{2}\right),$$ where $A\in\mathbb{R}^{m\times d}$, $B\in\mathbb{R}^{d\times n}$, and $d\geq \text{rank}(X)$. $\|\cdot\|_{*}$ denotes the trace norm, and $\|\cdot\|_{F}$ denotes the F-norm.

Is there any way to prove that? This should be easy but I'm having a hard time with it. Thanks for your generous help in advance!

1

There are 1 best solutions below

0
On

So you are looking at Schatten p-norms. In other words the "F norm" as you call it is the Frobenius norm (Schatten 2-norm). The trace norm is the nuclear norm (Schatten 1 norm).

To do much with Schatten norms, you eventually need the Von Neumann trace inequality, which is that

$\Big \vert \text{trace}\big(AB\big) \Big \vert \leq \text{trace}\big(\Sigma^{(A)}\Sigma^{(B)}\big)$
where e.g. $\Sigma^{(A)}$ has the singular values of $A$ in the usual ordering from largest to smallest and the same for $\Sigma^{(B)}$

your results then are a direct application the trace inequality combined with Cauchy Schwarz (multipicative case) and $\text{GM}\leq \text{AM}$ (additive case).

main inequalities
working with scalars in $\mathbb R$, supposing you know what polar decomposition is, then, with $AB = X = UP$
$\big \Vert X\big \Vert_{S_1}$
$=\big \Vert AB\big \Vert_{S_1}$
$=\text{trace}\big(U^TAB\big)$
$=\big \vert \text{trace}\big(U^TAB\big)\big \vert$
$=\big \vert \text{trace}\big((U^TA)B\big)\big \vert$
$\leq \text{trace}\big(\Sigma^{(A)}\Sigma^{(B)}\big)$
$\leq \Big \Vert A\big \Vert_F \Big \Vert B\big \Vert_F$
$= \Big \Vert A\big \Vert_{S_2} \Big \Vert B\big \Vert_{S_2}$
where the first inequality is the von Neumann trace inequality and the second is Cauchy Schwarz. Note: multiplication by an orthogonal matrix doesn't change singular values.

for the additive case the inequalities proceed identically except the ending is
$\text{trace}\big(\Sigma^{(A)}\Sigma^{(B)}\big)$
$= \sigma_1^{(A)}\sigma_1^{(B)} + \sigma_2^{(A)}\sigma_2^{(B)}+...+ \sigma_d^{(A)}\sigma_d^{(B)}$
$= \big((\sigma_1^{(A)})^2(\sigma_1^{(B)})^2\big)^\frac{1}{2} + \big((\sigma_2^{(A)})^2(\sigma_2^{(B)})^2\big)^\frac{1}{2}+...+ \big((\sigma_d^{(A)})^2(\sigma_d^{(B)})^2\big)^\frac{1}{2}$
$\leq \frac{1}{2} \Big((\sigma_1^{(A)})^2 + (\sigma_2^{(A)})^2+...+ \big((\sigma_d^{(A)})^2\Big) + \frac{1}{2} \Big((\sigma_1^{(B)})^2 + (\sigma_2^{(B)})^2+...+ \big((\sigma_d^{(B)})^2\Big)$
$= \frac{1}{2}\Big(\big \Vert A\big \Vert_F^2 +\big \Vert B\big \Vert_F^2\Big)$
where the inequality follows by $\text{GM}\leq \text{AM}$

why it's the minimum
Consider doing SVD on $X$. You have discretion in how you do SVD-- I'd suggest letting $\Sigma$ be square (d x d) and having $U$ and/or $V^T$ be non-square as needed here for a rectangular $X$.

$X = U\Sigma V^T$ and select $A:=U\Sigma^\frac{1}{2}$ and $B:=\Sigma^\frac{1}{2}V^T$

so $X = AB$

it's immediate that $\big \Vert A\big \Vert_F^2 = \big \Vert B\big \Vert_F^2 = \big \Vert X\big \Vert_{S_1}$
and hence these inequalities are met with equality and you have achieved minimums.

addendum:
if you don't want to use the von Neumann trace inequality, you could just write

$\big \Vert X\big \Vert_{S_1}$
$=\big \Vert AB\big \Vert_{S_1}$
$=\text{trace}\big(U^TAB\big)$
$=\big \vert \text{trace}\big(U^TAB\big)\big \vert$
$=\big \vert \text{trace}\big((U^TA)B\big)\big \vert$
$\leq \big \Vert U^TA\big \Vert_F \big \Vert B\big \Vert_F$
$= \big \Vert A\big \Vert_F \big \Vert B\big \Vert_F$
$\leq \frac{1}{2}\Big( \big \Vert A\big \Vert_F^2 + \big \Vert B\big \Vert_F^2\Big)$

where the first inequality is Cauchy Schwarz and the second is $\text{GM} \leq \text{AM}$