Subdifferential of $a^\text{T}x+\alpha\sqrt{x^\text{T}Bx}$ at $x=0$

467 Views Asked by At

I would like to compute the subdifferential of the function

$$ f(x)=a^\text{T}x+\alpha\sqrt{x^\text{T}Bx} $$ where $\alpha>0$ and $B$ is symmetric positive definite.

Attempted Solution (I am brand new to subdifferentiability)

Since subderivatives, like normal derivatives, are linear, we can compute term-by-term. Since the first term is differentiable, we're really interested in computing the subdifferential of

$$ \sqrt{x^\text{T}Bx} $$

at $x=0$. This has been asked in the question here, and I gather that the subdifferential should be

$$ \{z:\sqrt{z^\text{T}{B}z}\leqslant\lambda_\text{min}\} $$

where $\lambda_\text{min}$ is the smallest eigenvalue of $B$. All together, the subdifferential is

$$ \{a+\alpha z:\sqrt{z^\text{T}Bz}\leqslant\lambda_\text{min}\} $$

Is this correct? If so, can anyone elucidate why the subdifferential of $\sqrt{z^\text{T}Bz}$ is what it is? The linked question confused me.

2

There are 2 best solutions below

8
On BEST ANSWER

The subdifferential of $f$ at $x$ is the set of $g$ that satisfy $f(y)-f(x) \ge g^T(y-x)$ for all $y$ in the domain. For convex functions, this is the set of supporting hyperplanes to $\operatorname{epi} f$ at $x$.

In the following, $\overline{B}(0,1)$ denotes the closed Euclidean unit ball.

Note that if $f$ is a norm, we have $f(0) = 0$.

Note for the Euclidean norm that $\|y\|_2 \ge g^T y$ for all $y$ iff $\|g\|_2 \le 1$. If you look at the epigraph of the norm you will see a satisfying geometric explanation.

Since $B$ is symmetric positive definite, it has a square root $\sqrt{B}$, and we can write $\sqrt{ x^T B x} = \|\sqrt{B} x\|_2$. Let me write $n(x) = \sqrt{ x^T B x}$ for convenience.

Then $n(y) = \sqrt{ y^T B y} = \|\sqrt{B} y\|_2 \ge g^T y = ((\sqrt{B})^{-1}g)^T \sqrt{B}y$, so we see that $n(y) \ge g^T y$ for all $y$ iff $\|(\sqrt{B})^{-1}g \| \le 1$. Or, equivalently you can write this as $\sqrt{B} \, \overline{B}(0,1)$. That is, $\partial n(0) = \sqrt{B}\, \overline{B}(0,1)$.

It is straightforward to see that if $f$ has a subdifferential $\partial f(x)$ at $x$ then, if $\alpha >0$, we see that $\alpha f$ has a subdifferential $\alpha \partial f(x)$ at $x$

If $f$ is continuously differentiable at $x$, then ${\partial f(x)} = \{ \nabla f(x) \}$, so we see that if $f(x) = a^T x$, then $\partial f(x) = \{ a \}$.

Finally, if $f_k$ (a finite number of) have subdifferentials $\partial f_k(x)$ at $x$, then (I am assuming that the $f_k$ are finite valued everywhere) $\partial (\sum_k f_k)(x) = \sum_k \partial f_k(x)$.

Combining all of these, we get $\partial f(x) = \{a\}+ \alpha \sqrt{B} \, \overline{B}(0,1)$.

Alternative: Another more technical approach is to note that $f$ is continuously differentiable everywhere except at $x=0$, so we can compute $\partial f (x) = \operatorname{co}\{ \xi | \xi \text{ is a limit point of } \nabla f(x_k), \text{ where } x_k \to 0, x_k \neq 0 \}$.

If $x \neq 0$ we can compute $\nabla f(x) = a + \alpha {(\sqrt{B} x)^T \over \|\sqrt{B} x\|_2 } \sqrt{B}$, and it is straightforward to see that the collection of accumulation points (taken as $x \to 0$) are $\{a\}+ \alpha \sqrt{B} \, \overline{B}(0,1)$.

Elaboration: This is an marginally different derivation of the formula for $\partial n(0)$.

We are looking for $g$ that satisfy $n(y)-n(0) = n(y) \ge g^T y$ for all $y$.

Note that $\sqrt{B}$ is symmetric, and $n(y) = \sqrt{y^T B y } = \sqrt{y^T \sqrt{B}\sqrt{B} y } = \|\sqrt{B} y\|_2$

Since $\sqrt{B}$ is invertible, note that $\{y|y \in \mathbb{R}^n\} = \{ (\sqrt{B})^{-1} y | y \in \mathbb{R}^n \}$, so we can write the subgradient characterisation as $n((\sqrt{B})^{-1} y) \ge g^T (\sqrt{B})^{-1} y = ((\sqrt{B})^{-1} g)^T y$ for all $y$.

Substituting the previous expansion of $n(y)$ we get that $g$ is a subgradient of $n$ iff $\|y\|_2 \ge ((\sqrt{B})^{-1} g)^T y$ for all $y$.

From the $3$rd paragraph above, we see that $g$ is a subgradient iff $\|(\sqrt{B})^{-1} g\|_2 \le 1$.

Then $\partial n(0) = \{g | \|(\sqrt{B})^{-1} g\|_2 \le 1 \} = \{ \sqrt{B} g' | \|g'\|_2 \le 1 \} = \sqrt{B} \overline{B}(0,1)$.

1
On

Hint: Find matrix $P$ and $A$ such that $B = P A P^{T}$. (see here for details ) Then using this express the term $\sqrt{x^\text{T}Bx}$ in terms of Euclidean norm i.e,

$$ \sqrt{x^\text{T}Bx} = \| (P A^{\frac{1}{2}} )^T x \| $$

Now use chain rule for subdifferentials. (note that the matrix $( P A^{\frac{1}{2}} )^T $ has full rank.

Note that $\partial \|.\| (0) =B$ (the unite closed ball)