I've been reading this paper of Koltchinskii, Tsybakov and Lounici about nuclear norm penalization. Let me give you the context for the next three questions regarding Theorem 1 (I don't think that there is a need to read the theorem):
Let $A\in\mathbb{R}^{m_1\times m_2}$ be a rectangular matrix with $r=rank(A)\leq \min(m_11,m_2)$. The SVD of $A$ has the form $A=\sum_{j=1}^r\sigma_j(A)u_jv_j^{\top}$ with orthonormal vectors $u_1,...,u_r\in\mathbb{R}^{m_1}$ and (other) ortonormal vectors $v_1,...,v_r\in\mathbb{R}^{m_2}$, and real numbers $\sigma_1(A)\geq \cdots\geq \sigma_r(A)> 0$. The pair of linear spaces $(S_1,S_2)$ are $S_1=span\{u_1,...,u_r\}$ and $S_2=span\{v_1,...,v_r\}$. We denote $S_j^{\perp}$ the orthogonal complement of $S_j$, $j=1,2$, and by $P_S$ the projector on the linear subspace $S$ of $\mathbb{R}^{m_j}$, $j=1,2$.
The Schatten-$p$ (quasi-)norm $||A||_p$ of $A$ is \begin{align} ||A||_p = \left(\sum_{j=1}^{\min(m_1,m_2)}\sigma_j(A)^p\right)^{1/p}. \end{align}
Define the inner product $\langle A,B \rangle = tr(A^{\top}B)$.
Take as a known fact the trace duality property: \begin{align} |tr(A^{\top}B)|\leq ||A||_1||B||_{\infty}. \end{align} Also, suppose that we know that the subdifferential of the convex function $A\mapsto ||A||_1$ is \begin{align} \partial ||A||_1 = \left\{\sum_{j=1}^ru_jv_j^{\top}+P_{S_1^{\perp}}WP_{S_2^{\perp}}:||W||_{\infty}\leq 1\right\}. \end{align}
Here are the three questions:
First: Let $A,B\in\mathbb{R}^{m1\times m_2}$ and $V\in\partial||A||_{1}$ of the form $V=\sum_{j=1}^ru_jv_j^{\top}+P_{S_1^{\perp}}WP_{S_2^{\perp}}$. The authors say that by trace duality, there exist some $W$ with $||W||_{\infty}\leq 1$ such that \begin{align} \langle P_{S_1^{\perp}}WP_{S_2^{\perp}},B-A \rangle = \langle P_{S_1^{\perp}}WP_{S_2^{\perp}},B \rangle = \langle W,P_{S_1^{\perp}}BP_{S_2^{\perp}} \rangle = ||P_{S_1^{\perp}}BP_{S_2^{\perp}}||_1. \end{align} I understand perfectly the first two equalities, but I don't know why the last one is correct. Trace duality indicates an inequality. I think that they are using that there exists an specific and explicit for of $W$ such that we have equality in the trace duality property.
Second: Let $M\in\mathbb{R}^{m_1\times m_2}$ and $P_A(M) = M-P_{S_1^{\perp}}M P_{S_2^{\perp}}$. The authors say that $$||P_A(M)||_2\leq \sqrt{rank(P_A(M))}||M||_{\infty}.$$ I understand they use that $||A||_2\leq \sqrt{rank(A)}||A||_{\infty}$, but I think that there is a 2 missing in the inequality. They give the help $P_A(M) = P_{S_1^{\perp}}MP_{S_2}+P_{S_1}M$ and $rank(P_S)\leq rank(A)$.
Third: Let $A$ and $B$ as in First. The authors say that $||P_{S_1}(B-A)P_{S_2}||_2 \leq ||B-A||_2$. Maybe they used the submultiplicity of Schatten norms, but $||P_{S_j}|| \leq \sqrt{r}$.
I know that it is a lot of information, so thanks in advance. I do not expect complete answers for all three. Instead, I will appreciate some insight into at least one of the questions.