Polar set of orthogonal matrices set is nuclear norm ball

362 Views Asked by At

Reltated problems:
Show that the dual norm of the spectral norm is the nuclear norm
Prove that the nuclear norm is convex


The set of orthogonal matrices is defined as:

$$\mathcal{O}(n) = \{X\in \mathbb{R}^{n\times n}:X^TX=I\}$$

The polar of $\mathcal{O}(n)$:

$$\mathcal{O}(n)^o = \{Y\in \mathbb{R}^{n\times n}:\langle Y,X\rangle\leq 1, \forall X\in \mathcal{O}(n)\}$$

i.e., the set of linear functionals that take value at most one on $\mathcal{O}(n)$. The definition of polar cone is the general definition.

How to prove the polar of $\mathcal{O}(n)$ is the nuclear norm ball?
i.e. $$\mathcal{O}(n)^o = \{Y\in \mathbb{R}^{n\times n}:\sum_{i=1}^n\sigma_i(Y)\leq 1\}$$


I try to consider how to go to $\sum\sigma_i\leq1$ from $\langle Y,X\rangle=\text{tr}(YX)\leq 1$; however, I cannot find a way to break through.

By user1551's suggestion:

Let $YX=U\Sigma V^T$, where $Y\in \mathcal{O}(n)^o, X\in \mathcal{O}(n)$.

By Convex hull of orthogonal matrices, can I say:

$\|YX\|_2^2=\langle Y,X \rangle \leq 1$ so I get $\Sigma$'s diagonal elements $\in [0,1]^n$. If this is true, how to say the SVD of $Y$, particularly $\Sigma(Y)$?

Note: $\|\cdot\|_2$ is the spectral norm (the largest singular value).

1

There are 1 best solutions below

0
On

I will call $\mathcal{A} = \{Y\in \mathbb{R}^{n\times n}:\langle Y,X\rangle\leq 1, \forall X\in \mathcal{O}(n)\}$ and $\mathcal{B} = \{Y\in \mathbb{R}^{n\times n}:\sum_{i=1}^n\sigma_i(Y)\leq 1\}$, denote the nuclear norm by $\|\cdot\|_1$ and the operator norm by $\|X\|_{\infty}$ (the indexing comes from viewing them as special cases of the Schatten norms).

Part 1: $\mathcal{B} \subseteq \mathcal{A}$

This direction follows from the Hölder's inequality. Let $Y \in \mathcal{B}$ and $X \in \mathcal{O}(n)$ then by Hölder's inequality we have $$ \begin{aligned} \langle Y, X \rangle &\leq \|Y\|_1 \|X\|_{\infty}\\ &= \|Y\|_1\\ &\leq 1. \end{aligned} $$ Thus $Y \in \mathcal{A}$ and $\mathcal{B} \subseteq \mathcal{A}$.

Part 2: $\mathcal{A} \subseteq \mathcal{B}$

This direction is a little more tricky. To solve it we need the following lemma which gives a variational characterization of the nuclear norm.

Lemma: Let $Y \in \mathbb{R}^{n\times n}$ then $$ \|Y\|_1 = \sum_{i=1}^n\sigma_i(Y) = \max_{X \in \mathcal{O}(n)} \langle Y,X\rangle.U $$ Proof$\quad$ Consider the singular value decomposition $Y = UDV$ where $U,V \in \mathcal{O}(n)$ and $D$ is the diagonal matrix containing the singular values of $Y$. Then let $X = UV$ and we see $$ \begin{aligned} \langle Y, X\rangle &= \mathrm{Tr}[Y^T X] \\ &= \mathrm{Tr}[V^T D U^T UV] \\ &= \mathrm{Tr}[VV^T D U^T U] \\ &= \mathrm{Tr}[D] \\ &= \sum_{i=1}^n \sigma_i(Y). \end{aligned} $$ Thus $\sum_i \sigma_i(Y) \leq \max_{X \in \mathcal{O}(n)} \langle Y,X\rangle$. The other inequality follows directly from the Hölder's inequality used in Part 1, i.e. $\langle Y, X\rangle \leq \|Y\|_1$ for any $X \in \mathcal{O}(n)$. $$\tag*{$\blacksquare$}$$

Now the result follows fairly quickly. Given $Y \in \mathcal{A}$ we know $\langle Y , X \rangle \leq 1$ for all $X \in \mathcal{O}(n)$. However, from the above lemma we also know that $$ \begin{aligned} \sum_{i=1}^n \sigma_i(Y) &= \max_{X \in \mathcal{O}(n)} \langle Y,X\rangle \\ &\leq 1 \end{aligned} $$ Thus $Y \in \mathcal{B}$ and $\mathcal{A} \subseteq \mathcal{B}$ and we are done!