About trace duality and other two bounds...

83 Views Asked by At

I've been reading this paper of Koltchinskii, Tsybakov and Lounici about nuclear norm penalization. Let me give you the context for the next three questions regarding Theorem 1 (I don't think that there is a need to read the theorem):

Let $A\in\mathbb{R}^{m_1\times m_2}$ be a rectangular matrix with $r=rank(A)\leq \min(m_11,m_2)$. The SVD of $A$ has the form $A=\sum_{j=1}^r\sigma_j(A)u_jv_j^{\top}$ with orthonormal vectors $u_1,...,u_r\in\mathbb{R}^{m_1}$ and (other) ortonormal vectors $v_1,...,v_r\in\mathbb{R}^{m_2}$, and real numbers $\sigma_1(A)\geq \cdots\geq \sigma_r(A)> 0$. The pair of linear spaces $(S_1,S_2)$ are $S_1=span\{u_1,...,u_r\}$ and $S_2=span\{v_1,...,v_r\}$. We denote $S_j^{\perp}$ the orthogonal complement of $S_j$, $j=1,2$, and by $P_S$ the projector on the linear subspace $S$ of $\mathbb{R}^{m_j}$, $j=1,2$.

The Schatten-$p$ (quasi-)norm $||A||_p$ of $A$ is \begin{align} ||A||_p = \left(\sum_{j=1}^{\min(m_1,m_2)}\sigma_j(A)^p\right)^{1/p}. \end{align}

Define the inner product $\langle A,B \rangle = tr(A^{\top}B)$.

Take as a known fact the trace duality property: \begin{align} |tr(A^{\top}B)|\leq ||A||_1||B||_{\infty}. \end{align} Also, suppose that we know that the subdifferential of the convex function $A\mapsto ||A||_1$ is \begin{align} \partial ||A||_1 = \left\{\sum_{j=1}^ru_jv_j^{\top}+P_{S_1^{\perp}}WP_{S_2^{\perp}}:||W||_{\infty}\leq 1\right\}. \end{align}

Here are the three questions:

First: Let $A,B\in\mathbb{R}^{m1\times m_2}$ and $V\in\partial||A||_{1}$ of the form $V=\sum_{j=1}^ru_jv_j^{\top}+P_{S_1^{\perp}}WP_{S_2^{\perp}}$. The authors say that by trace duality, there exist some $W$ with $||W||_{\infty}\leq 1$ such that \begin{align} \langle P_{S_1^{\perp}}WP_{S_2^{\perp}},B-A \rangle = \langle P_{S_1^{\perp}}WP_{S_2^{\perp}},B \rangle = \langle W,P_{S_1^{\perp}}BP_{S_2^{\perp}} \rangle = ||P_{S_1^{\perp}}BP_{S_2^{\perp}}||_1. \end{align} I understand perfectly the first two equalities, but I don't know why the last one is correct. Trace duality indicates an inequality. I think that they are using that there exists an specific and explicit for of $W$ such that we have equality in the trace duality property.

Second: Let $M\in\mathbb{R}^{m_1\times m_2}$ and $P_A(M) = M-P_{S_1^{\perp}}M P_{S_2^{\perp}}$. The authors say that $$||P_A(M)||_2\leq \sqrt{rank(P_A(M))}||M||_{\infty}.$$ I understand they use that $||A||_2\leq \sqrt{rank(A)}||A||_{\infty}$, but I think that there is a 2 missing in the inequality. They give the help $P_A(M) = P_{S_1^{\perp}}MP_{S_2}+P_{S_1}M$ and $rank(P_S)\leq rank(A)$.

Third: Let $A$ and $B$ as in First. The authors say that $||P_{S_1}(B-A)P_{S_2}||_2 \leq ||B-A||_2$. Maybe they used the submultiplicity of Schatten norms, but $||P_{S_j}|| \leq \sqrt{r}$.

I know that it is a lot of information, so thanks in advance. I do not expect complete answers for all three. Instead, I will appreciate some insight into at least one of the questions.