Convexity of $\mathrm{trace}(S) + m^2\mathrm{trace}(S^{-2})$


I have the function $f(S)=\mathrm{trace}(S)+m^2\mathrm{trace}(S^{-2})$, where $S\in \mathcal{M}_{m,m}$ is a symmetric positive definite matrix. I'm trying to prove that $f$ is convex. How can this be shown properly?

Best answer

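Let $S_1$, $S_2$ be two symmetric positive definite matrices. Let $\Delta = S_2 - S_1$ and, for $t \in [0,1]$, let $$\phi = \phi(t) = (S_1 + t\Delta)^{-1} = ((1 - t) S_1 + t S_2)^{-1}.$$ Note that $\phi$ is symmetric positive definite, since it is the inverse of a convex combination of $S_1$ and $S_2$. We have: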

$$\begin{align} & \frac{d}{dt} \phi \;= - \phi \Delta \phi\\ \implies & \frac{d}{dt} \phi^2 \;= - \phi \Delta \phi^2 - \phi^2 \Delta \phi\\ \implies & \frac{d^2}{dt^2} \phi^2 = ( \phi \Delta \phi ) \Delta \phi^2 + \phi \Delta ( \phi \Delta \phi^2 + \phi^2 \Delta \phi ) + ( \phi \Delta \phi^2 + \phi^2 \Delta \phi ) \Delta \phi + \phi^2 \Delta ( \phi \Delta \phi ) \end{align}$$ Taking the trace of both sides and using its cyclic invariance together with the symmetry of $\phi$ and $\Delta$, we get $$\begin{align} \frac{d^2}{dt^2} \operatorname{tr}(\phi^2) &= 2\operatorname{tr}( (\phi\Delta)^2\phi^2 + (\phi\Delta\phi)^2 + \phi^2 (\Delta\phi)^2)\\ &= 2\operatorname{tr}\left( 2 (\phi^{1/2}\Delta\phi^{3/2})^T(\phi^{1/2}\Delta\phi^{3/2}) + (\phi\Delta\phi)^T(\phi\Delta\phi)\right)\ge 0\tag{*} \end{align}$$

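Notice that for any $t \in [0,1]$, $\phi$ is invertible. Hence, whenever $S_1 \ne S_2$ (so that $\Delta \ne 0$), $\phi\Delta\phi$ is non-zero and the R.H.S. of $(*)$ is strictly positive. The term $\operatorname{tr}(S_1 + t\Delta)$ is linear in $t$, so it contributes nothing to the second derivative. As a result, for $m \ne 0$,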

$$\frac{d^2}{dt^2}\operatorname{tr}\left(((1-t)S_1 + t S_2) + m^2((1 - t) S_1 + t S_2)^{-2}\right) > 0 $$

over $[0,1]$, and hence $\operatorname{tr}(S+m^2 S^{-2})$ is (strictly) convex over the space of positive definite matrices.
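As a sanity check, the argument above can be tested numerically. The sketch below is only an illustration under stated assumptions: it works with $2\times 2$ matrices (so a closed-form inverse suffices), fixes $m = 2$, and uses only the Python standard library. All function names are hypothetical helpers, not part of any library. It checks the chord inequality $f((1-t)S_1 + tS_2) \le (1-t)f(S_1) + tf(S_2)$ on a grid, and compares $2\operatorname{tr}(2\phi^3\Delta\phi\Delta + (\phi\Delta\phi)^2)$, which equals the R.H.S. of $(*)$ by cyclicity of the trace, with a central finite difference of $\operatorname{tr}(\phi^2)$.

```python
import random

N = 2  # assumption: 2x2 matrices, so the closed-form inverse below applies

def matmul(a, b):
    """Product of two N x N matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def inverse(a):
    """Closed-form inverse of a 2x2 matrix."""
    (p, q), (r, s) = a
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]

def trace(a):
    return sum(a[i][i] for i in range(N))

def segment(s1, s2, t):
    """The point (1 - t) S1 + t S2 = S1 + t Delta."""
    return [[(1 - t) * s1[i][j] + t * s2[i][j] for j in range(N)]
            for i in range(N)]

def f(s, m=2):
    """f(S) = tr(S) + m^2 tr(S^{-2}); m = 2 is an arbitrary choice."""
    inv = inverse(s)
    return trace(s) + m ** 2 * trace(matmul(inv, inv))

def random_spd(rng):
    """Random symmetric positive definite matrix B^T B + I."""
    b = [[rng.uniform(-1.0, 1.0) for _ in range(N)] for _ in range(N)]
    return [[sum(b[k][i] * b[k][j] for k in range(N)) + (1.0 if i == j else 0.0)
             for j in range(N)] for i in range(N)]

def is_convex_on_segment(s1, s2, steps=20, tol=1e-12):
    """Check f on the segment against the chord between its endpoints."""
    for k in range(steps + 1):
        t = k / steps
        chord = (1 - t) * f(s1) + t * f(s2)
        if f(segment(s1, s2, t)) > chord + tol:
            return False
    return True

def second_derivative_formula(s1, s2, t):
    """2 tr(2 phi^3 Delta phi Delta + (phi Delta phi)^2)."""
    delta = [[s2[i][j] - s1[i][j] for j in range(N)] for i in range(N)]
    phi = inverse(segment(s1, s2, t))
    phi_delta = matmul(phi, delta)
    phi_delta_phi = matmul(phi_delta, phi)
    phi_cubed = matmul(phi, matmul(phi, phi))
    term_a = trace(matmul(phi_cubed, matmul(delta, phi_delta)))
    term_b = trace(matmul(phi_delta_phi, phi_delta_phi))
    return 2 * (2 * term_a + term_b)

def second_difference(s1, s2, t, h=1e-3):
    """Central finite-difference estimate of d^2/dt^2 tr(phi(t)^2)."""
    def g(u):
        phi = inverse(segment(s1, s2, u))
        return trace(matmul(phi, phi))
    return (g(t + h) - 2 * g(t) + g(t - h)) / h ** 2

rng = random.Random(0)
s1, s2 = random_spd(rng), random_spd(rng)
print(is_convex_on_segment(s1, s2))
print(second_derivative_formula(s1, s2, 0.5), second_difference(s1, s2, 0.5))
```

The two printed second-derivative values should agree up to finite-difference error. A numerical check of this kind does not replace the proof, it only guards against algebra slips in the derivation.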

Update

Thinking more about this, it might be cleaner to prove $\operatorname{tr}(S^{-n})$ is convex for all $n \ge 1$ at once.

Let $\psi(t) = S_1 + t\Delta$ and, for any $\lambda > 0$, let $Z_{\lambda}(t) = \operatorname{tr}(e^{-\lambda \psi(t)})$. For $\Delta \ne 0$ we have:

$$\begin{align} \frac{d}{dt}Z_{\lambda}(t) &= \operatorname{tr}\left( \int_0^1 ds\;e^{-\lambda s\psi(t)}( -\lambda\Delta )e^{-\lambda(1-s)\psi(t)}\right)\\ &= -\lambda \operatorname{tr}\left(e^{-\lambda\psi(t)}\Delta\right)\\ \implies \frac{d^2}{dt^2}Z_{\lambda}(t) &= \lambda^2 \operatorname{tr}\left(\int_0^1 ds\;e^{-\lambda s\psi(t)}\Delta e^{-\lambda(1-s)\psi(t)}\Delta\right)\\ &= \lambda^2 \int_0^1 ds \operatorname{tr}\left( ( e^{-\frac{\lambda s}{2}\psi(t)}\Delta e^{-\frac{\lambda(1-s)}{2}\psi(t)} )^T ( e^{-\frac{\lambda s}{2}\psi(t)}\Delta e^{-\frac{\lambda(1-s)}{2}\psi(t)} ) \right)\\ &> 0 \end{align}$$

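So for any $n \ge 1$, using $x^{-n} = \frac{1}{(n-1)!}\int_0^{\infty} \lambda^{n-1} e^{-\lambda x}\, d\lambda$ for $x > 0$ (applied to each eigenvalue of $\psi(t)$), we have:

$$\frac{d^2}{dt^2} \operatorname{tr}( \psi(t)^{-n} ) =\frac{d^2}{dt^2} \operatorname{tr}\left(\int_0^{\infty}\frac{\lambda^{n-1}}{(n-1)!} e^{-\lambda\psi(t)} d\lambda\right) = \frac{1}{(n-1)!}\int_0^{\infty} \lambda^{n-1} \frac{d^2Z_{\lambda}(t)}{dt^2} d\lambda > 0 $$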
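The integral representation rests on the Gamma-function identity $x^{-n} = \frac{1}{(n-1)!}\int_0^{\infty}\lambda^{n-1}e^{-\lambda x}\,d\lambda$, which can be checked numerically. The sketch below is an illustration only: `laplace_power` is a hypothetical helper that truncates the integral at $\lambda = 60/x$ and applies composite Simpson's rule, and the eigenvalue list is made up. Since $\operatorname{tr}(\psi^{-n})$ and $\operatorname{tr}(e^{-\lambda\psi})$ depend only on the eigenvalues of $\psi$, a scalar check per eigenvalue is enough.

```python
import math

def laplace_power(x, n, steps=20000):
    """Approximate (1/(n-1)!) * integral_0^inf lam^(n-1) exp(-lam*x) d lam.

    Assumptions: x > 0, n >= 1, and steps is even (needed by Simpson's rule).
    """
    upper = 60.0 / x   # the tail beyond this point is damped by about exp(-60)
    h = upper / steps

    def integrand(lam):
        return lam ** (n - 1) * math.exp(-lam * x)

    total = integrand(0.0) + integrand(upper)
    for k in range(1, steps):
        # Simpson weights: 4 at odd nodes, 2 at even interior nodes
        total += (4 if k % 2 else 2) * integrand(k * h)
    return total * h / 3 / math.factorial(n - 1)

eigenvalues = [0.5, 2.0, 3.0]  # hypothetical spectrum of psi(t)
n = 3
lhs = sum(mu ** (-n) for mu in eigenvalues)            # tr(psi^{-n})
rhs = sum(laplace_power(mu, n) for mu in eigenvalues)  # integral form
print(lhs, rhs)
```

The two printed numbers should agree closely, which confirms the $1/(n-1)!$ normalisation. The constant is positive either way, so it does not affect the sign of the second derivative.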