The nuclear norm is defined as
$$\| X \|_* := \operatorname{tr} \left( \sqrt{X^T X} \right)$$
and, from Derivative of the nuclear norm with respect to its argument,
$$\frac{d}{dX} \| X \|_* = U\Sigma^{-1}\lvert\Sigma\rvert V^T,$$ where $X = U \Sigma V^T$ is a singular value decomposition of $X$.
What is the second derivative of the nuclear norm?
$$\frac{d^2}{dX^2} \| X \|_* = ?$$
I need it to run Newton's method in my algorithm, and I haven't had much success. Any help would be greatly appreciated. Thanks in advance!
The reason you haven't had any success is that the nuclear norm is not differentiable everywhere. Your expression for the derivative is only partly correct: on the positive definite cone it is correct and reduces to the identity matrix $I$, but if the matrix $X$ is low rank then the function is not differentiable there, and the best you can do is compute the subdifferential $$ \partial \|X\|_* = \left\{UV^\top+W: U^\top W = 0,\ WV = 0, \quad\|W\|_2 \le 1\right\}, $$ where $\|W\|_2 = \sigma_{\max}(W)$ is the spectral norm, i.e. the largest singular value of $W$.
Since the nuclear norm is a norm, and all norms are convex, we can talk about the subdifferential: the set of slopes of all affine functions that lie below the function and touch it at the point in question. The proof of the result above can be found in this paper:
Watson, G. A. Characterization of the subdifferential of some matrix norms. Linear Algebra and its Applications, 170:33–45, 1992.
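At a full-rank $X$ the subdifferential collapses to the single element $UV^\top$, which is then the gradient, and this is easy to sanity-check numerically. A small sketch in Python/NumPy (my own, not from the post; `np.linalg.norm(..., ord='nuc')` computes the nuclear norm):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 3))  # full rank with probability 1

# Thin SVD: X = U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(X, full_matrices=False)
G = U @ Vt  # U V^T, the gradient of the nuclear norm at a full-rank X

# Central finite differences of ||.||_* as an independent check
nuc = lambda A: np.linalg.norm(A, ord="nuc")
eps = 1e-6
G_fd = np.zeros_like(X)
for i in range(X.shape[0]):
    for j in range(X.shape[1]):
        E = np.zeros_like(X)
        E[i, j] = eps
        G_fd[i, j] = (nuc(X + E) - nuc(X - E)) / (2 * eps)

print(np.max(np.abs(G - G_fd)))  # close to zero
```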
From the expression above you can see that the nuclear norm is piecewise linear on each of these cones, so the second derivative is zero on each of the differentiable pieces. That should explain why Newton's algorithm doesn't work so well (though it also depends on what else is in your objective function). Another way to look at it is that every norm looks like $|x|$ on one-dimensional slices through the origin, and $|x|$ is piecewise linear with second derivative $0$ everywhere except at $x=0$, where it is not differentiable.
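The slice claim follows from absolute homogeneity: $\|tX\|_* = |t|\,\|X\|_*$, so along any ray through the origin the nuclear norm is exactly a scaled $|t|$, kinked at $t = 0$. A quick numerical check (my own sketch):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((3, 3))
nuc = lambda A: np.linalg.norm(A, ord="nuc")

# Along t -> t*X the nuclear norm is |t| * ||X||_*: a scaled |x|,
# non-differentiable at t = 0 with second derivative 0 elsewhere.
c = nuc(X)
for t in [-2.0, -0.5, 0.7, 3.0]:
    assert abs(nuc(t * X) - abs(t) * c) < 1e-9
print("||tX||_* = |t| * ||X||_* at every sampled t")
```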
If you really want to use a Newton-type approach, you can use the variational formulation $$ \|X\|_* = \min_{L,R:\,LR^\top = X} \frac{1}{2}\left(\|L\|_F^2+\|R\|_F^2\right), $$ so minimizing a function of the form $$ f(X)+\|X\|_* = \min_{L,R} f(LR^\top)+ \frac{1}{2}\left(\|L\|_F^2+\|R\|_F^2\right) $$ can be done by alternating minimization over $L$ and $R$, where each subproblem is smooth whenever $f$ is smooth.
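A minimal sketch of that alternating scheme in Python/NumPy. It assumes, purely for illustration, $f(X) = \frac{1}{2}\|X - A\|_F^2$ (not from the post); the closed-form ridge-regression updates below are specific to that choice, and the result is compared against the known minimizer of $\min_X \frac{1}{2}\|X-A\|_F^2 + \|X\|_*$, which is singular-value soft-thresholding of $A$ at level $1$:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((6, 5))
k = 5  # factor rank; must be at least the rank of the minimizer

# Illustrative choice (not from the post): f(X) = 0.5 * ||X - A||_F^2
L = rng.standard_normal((6, k))
R = rng.standard_normal((5, k))
I = np.eye(k)

for _ in range(500):
    # Each subproblem is a smooth ridge regression with a closed form:
    # argmin_L 0.5*||L R^T - A||_F^2 + 0.5*||L||_F^2 = A R (R^T R + I)^{-1}
    L = A @ R @ np.linalg.inv(R.T @ R + I)
    R = A.T @ L @ np.linalg.inv(L.T @ L + I)

X = L @ R.T
obj = 0.5 * np.linalg.norm(X - A) ** 2 + np.linalg.norm(X, ord="nuc")

# Reference: the exact minimizer of 0.5*||X - A||_F^2 + ||X||_* is
# singular-value soft-thresholding of A at level 1.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
X_star = U @ np.diag(np.maximum(s - 1.0, 0.0)) @ Vt
obj_star = 0.5 * np.linalg.norm(X_star - A) ** 2 + np.linalg.norm(X_star, ord="nuc")
print(obj - obj_star)  # gap to the exact optimum; shrinks toward 0
```

Note that the factored objective is nonconvex, so alternating minimization is only guaranteed to reach a stationary point; with random initialization and $k$ large enough it typically recovers the global optimum here.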