I've been reading the paper "Nuclear-norm penalization and optimal rates for noisy low-rank matrix completion" by Koltchinskii et al.
Let $\partial F(x)$ be the subdifferential of the convex function $F$ at $x$. I understand that $$\partial ||A||^2_F=2A,$$ where $||A||_F$ is the Frobenius norm of the matrix $A$, and that $$\partial ||A||_*=\{\sum_{k=1}^ru_kv^{T}_k+P_u^{\perp}WP_{v}^{\perp}:||W||\leq 1\},$$ where $||A||_{*}$ is the nuclear norm of $A$, $A=\sum_{k=1}^r\sigma_ku_kv^{T}_k$ is the SVD of $A$, $P_u$ and $P_v$ are the orthogonal projectors onto the spans of the $u_k$ and the $v_k$ respectively, and $||W||$ is the operator norm of the matrix $W$.
For some matrix $X$ define $F(A)=||X-A||_F^2+\lambda||A||_{*}$. In the article they say that $$\partial F(A)=\left\{2(X-A)+\lambda\left(\sum_{k=1}^ru_kv^{T}_k+P_u^{\perp}WP_{v}^{\perp}\right):||W||\leq 1\right\}.$$
Is the subdifferential some kind of "linear function"? I don't understand why this is true.
Is there any good reference on subdifferentials and their properties?
The standard reference is probably Rockafellar, R. Tyrrell (1997). Convex Analysis. Princeton Landmarks in Mathematics.
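As for why the formula has that shape, here is a short sketch rather than a full proof. The subdifferential is not linear in general, but it does satisfy a sum rule (the Moreau-Rockafellar theorem, Theorem 23.8 in Rockafellar's book): for convex functions that are finite everywhere, the subdifferential of the sum is the Minkowski sum of the subdifferentials. The names $g$ and $h$ below are my own notation, not the paper's, and I assume $\lambda\geq 0$. Write $g(A)=||X-A||_F^2$ and $h(A)=\lambda||A||_{*}$, so that $F=g+h$. Since $g$ is differentiable, $\partial g(A)$ is the single point $\nabla g(A)$, and so

$$\partial F(A)=\partial g(A)+\partial h(A)=\left\{\nabla g(A)+\lambda Z: Z\in\partial||A||_{*}\right\},\qquad \nabla g(A)=2(A-X).$$

Substituting the description of $\partial||A||_{*}$ from the question gives a set of the same form as the one quoted. Note that differentiating $g$ gives $2(A-X)$, so with $F$ exactly as defined in the question the first term would be $2(A-X)$ rather than $2(X-A)$; it may be worth checking the sign against the paper.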