Derivative of $\|X-\alpha Y\|_2$ with respect to $\alpha$.


Let $X$ and $Y$ be operators on a real or complex Hilbert space $\mathcal{H}$ and $f(\alpha) = \|X - \alpha Y\|_2$ where $\alpha$ is real and $\|A\|_2 = \sigma_{\mathsf{max}}(A)$ is the $\ell^2$-induced operator norm. What is $\frac{df}{d\alpha}$?

The function $f$ is convex, so even where it is not differentiable, a subgradient will suffice.

If it helps, we can assume $X=I$ and $Y$ positive definite, though I would rather see a more general result. Considering $f^2$ instead of $f$ is also fine.

Plots of $f$: I ran two simple numerical examples which might be enlightening. In plot 1, $X = I\in M_{50}(\mathbb{R})$ and $Y = Z^\mathsf{T}Z + I$ where $Z_{ij}\sim\mathcal{N}(0,1)$. As we can see, the plot seems piecewise linear.

[Plot 1: $f(\alpha)$ for $X = I$, $Y = Z^\mathsf{T}Z + I$]

In plot 2, we take $X = G - I \in M_{50}(\mathbb{R})$ with $G_{ij}\sim\mathcal{N}(0,1)$, and $Y_{ij}\sim\mathcal{N}(0,1)$. Note that neither $X$ nor $Y$ is symmetric. This example looks differentiable and practically quadratic.

[Plot 2: $f(\alpha)$ for non-symmetric Gaussian $X$ and $Y$]
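For anyone who wants to reproduce curves like these, here is a minimal NumPy sketch. The seed, the $\alpha$ grid, and the helper name `spectral_norm_curve` are assumptions made for illustration, not the original setup. For the first example it also compares against $\max(|1-\alpha\lambda_{\min}(Y)|, |1-\alpha\lambda_{\max}(Y)|)$, which is what the spectral norm of $I-\alpha Y$ reduces to when $Y$ is symmetric, and which would be consistent with the piecewise-linear look.

```python
import numpy as np

def spectral_norm_curve(X, Y, alphas):
    """Return f(alpha) = ||X - alpha*Y||_2 (largest singular value) on a grid."""
    return np.array([np.linalg.norm(X - a * Y, 2) for a in alphas])

rng = np.random.default_rng(0)          # arbitrary seed (assumption)
n = 50
alphas = np.linspace(-0.1, 0.1, 201)    # hypothetical range for alpha

# Example 1: X = I, Y = Z^T Z + I with Gaussian Z.
Z = rng.standard_normal((n, n))
X1, Y1 = np.eye(n), Z.T @ Z + np.eye(n)
f1 = spectral_norm_curve(X1, Y1, alphas)

# Y1 is symmetric, so I - alpha*Y1 has eigenvalues 1 - alpha*lambda_i and
# its spectral norm is attained at the smallest or largest eigenvalue.
lam = np.linalg.eigvalsh(Y1)            # eigenvalues in ascending order
f1_closed = np.maximum(np.abs(1 - alphas * lam[0]),
                       np.abs(1 - alphas * lam[-1]))

# Example 2: non-symmetric Gaussian X (shifted by -I) and Y.
X2 = rng.standard_normal((n, n)) - np.eye(n)
Y2 = rng.standard_normal((n, n))
f2 = spectral_norm_curve(X2, Y2, alphas)
```

Plotting `f1` and `f2` against `alphas` (for instance with matplotlib) should give curves of the kind described above, although the exact shapes depend on the random draw.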

Best answer

There is at least a subset of cases where we can do this. Note that for any norm $\|\cdot\|$, $$\|W\| = \sup_{\|Z\|_* \leq 1} \langle W, Z \rangle$$ where $\|\cdot\|_*$ is the dual norm and, for matrices, $\langle W, Z \rangle = \Re\operatorname{tr}(Z^H W)$. For the matrix spectral norm, the dual norm is the nuclear norm (the sum of the singular values). Any maximizer $Z$ of the expression above is a subgradient of the norm at $W$. That is, let $Z^*$ be any point satisfying $$\|Z^*\|_* = 1, \quad \langle W, Z^* \rangle = \|W\|.$$ Then, for any perturbation $\delta W$, $$\|W + \delta W\| \geq \langle W + \delta W, Z^* \rangle = \langle W, Z^* \rangle + \langle \delta W, Z^* \rangle = \|W\| + \langle Z^*, \delta W \rangle$$ so $Z^* \in \partial \|W\|$. For the matrix spectral norm, valid values of $Z^*$ are readily obtained: if $W=U\Sigma V^H = \sum_i \sigma_i u_i v_i^H$ is the SVD of $W$, then $$\partial \|W\|_2 = \mathop{\textrm{Conv}}\{u_iv_i^H\,|\,\sigma_i=\sigma_{\mathsf{max}}(W)\}.$$ When $\sigma_{\mathsf{max}}(W)$ is simple, this is the single matrix $u_1v_1^H$. When it is repeated, the pairs $(u_i, v_i)$ should range over all unit singular-vector pairs belonging to $\sigma_{\mathsf{max}}(W)$, not just one particular basis.
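As a quick numerical sanity check of the subgradient construction, here is a small NumPy sketch. The function names `spectral_norm_subgradient` and `inner`, the matrix size, and the seed are all hypothetical choices for illustration. It builds $Z^* = u_1 v_1^H$ from the SVD and checks the three properties used in the argument: $\|Z^*\|_* = 1$, $\langle W, Z^*\rangle = \|W\|_2$, and the subgradient inequality for random perturbations.

```python
import numpy as np

def spectral_norm_subgradient(W):
    """Return u1 v1^H, one subgradient of the spectral norm at W."""
    U, s, Vh = np.linalg.svd(W, full_matrices=False)
    # Row 0 of Vh is v1^H, so the outer product below is u1 v1^H.
    return np.outer(U[:, 0], Vh[0, :])

def inner(A, B):
    """Real trace inner product <A, B> = Re tr(B^H A)."""
    return np.real(np.vdot(B, A))

rng = np.random.default_rng(1)  # arbitrary seed (assumption)
W = rng.standard_normal((6, 4)) + 1j * rng.standard_normal((6, 4))
Z_star = spectral_norm_subgradient(W)
norm_W = np.linalg.norm(W, 2)

# Gap in the subgradient inequality ||W + D|| >= ||W|| + <D, Z*>;
# every entry should be nonnegative up to rounding.
gaps = []
for _ in range(100):
    D = rng.standard_normal(W.shape) + 1j * rng.standard_normal(W.shape)
    gaps.append(np.linalg.norm(W + D, 2) - norm_W - inner(D, Z_star))
```

The sketch returns only one element of $\partial\|W\|_2$; it does not attempt to enumerate the whole set when the largest singular value is repeated.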

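Now take $W=X-\alpha Y$ and $\delta W=-\delta\alpha\, Y$. Then for any $Z^* \in \partial\|X-\alpha Y\|$ we have $$\|X-(\alpha+\delta \alpha)Y\| \geq \|X-\alpha Y\| + \langle Z^*, -\delta \alpha Y\rangle = \|X-\alpha Y\| - \delta\alpha \langle Z^*, Y \rangle$$ so $-\langle Z^*, Y \rangle \in \partial f(\alpha)$. For the matrix spectral norm, values of $Z^*$ can be obtained by applying the SVD approach above to $X-\alpha Y$. Writing $u_i, v_i$ for the left and right singular vectors of $X-\alpha Y$ and using $\langle u_iv_i^H, Y\rangle = \Re(u_i^H Y v_i)$, we get $$\partial f(\alpha) \supseteq \mathop{\textrm{Conv}}\{-\Re(u_i^H Y v_i)\,|\,\sigma_i=\sigma_{\mathsf{max}}(X-\alpha Y)\}.$$ For matrices the inclusion is in fact an equality when the $(u_i, v_i)$ range over all unit singular-vector pairs belonging to $\sigma_{\mathsf{max}}$, because the norm is finite everywhere and $\alpha \mapsto X-\alpha Y$ is affine, so the chain rule for convex functions applies. In particular, wherever the largest singular value of $X-\alpha Y$ is simple, $f$ is differentiable with $$\frac{df}{d\alpha} = -\Re(u_1^H Y v_1).$$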
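To illustrate, here is a minimal NumPy sketch that evaluates $-\Re(u_1^H Y v_1)$ and compares it with a central finite difference of $f$. The names `f` and `f_subgradient`, the matrix size, the test point $\alpha = 0.3$, and the step size are assumptions for illustration only. The finite-difference comparison is only meaningful where the top singular value is simple, which is the generic situation for random matrices.

```python
import numpy as np

def f(alpha, X, Y):
    """f(alpha) = ||X - alpha*Y||_2, the largest singular value."""
    return np.linalg.norm(X - alpha * Y, 2)

def f_subgradient(alpha, X, Y):
    """Return -Re(u1^H Y v1), one element of the subdifferential of f at alpha."""
    U, s, Vh = np.linalg.svd(X - alpha * Y, full_matrices=False)
    u1 = U[:, 0]
    v1 = Vh[0, :].conj()           # row 0 of Vh is v1^H
    return -np.real(np.vdot(u1, Y @ v1))   # vdot conjugates its first argument

rng = np.random.default_rng(2)     # arbitrary seed (assumption)
n = 8
X = rng.standard_normal((n, n)) - np.eye(n)
Y = rng.standard_normal((n, n))

alpha0 = 0.3                       # hypothetical evaluation point
g = f_subgradient(alpha0, X, Y)

# Central finite difference for comparison.
h = 1e-6
fd = (f(alpha0 + h, X, Y) - f(alpha0 - h, X, Y)) / (2 * h)

# Subgradient inequality f(beta) >= f(alpha0) + g * (beta - alpha0) on a grid.
betas = np.linspace(-2.0, 2.0, 81)
slack = np.array([f(b, X, Y) - f(alpha0, X, Y) - g * (b - alpha0) for b in betas])
```

At a kink, such as the one in the $X=I$ example, the same function still returns a valid subgradient, but which one you get depends on which top singular pair the SVD routine happens to return.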